RANDOM AND SYSTEMATIC ARRANGEMENTS

By HAROLD JEFFREYS, F.R.S.


I [...] with the recent discussions of this question, though I
cannot claim personal knowledge of the types of problem where the proposed
arrangements are used. My personal feeling, as a seismologist dealing with
natural earthquakes, is one of envy of workers in subjects where it is possible
to design experiments at all, and especially where they can be designed in such
a way that the normal equations for the various parameters to be estimated will
contain no cross terms. Nevertheless, I think that [...] concerning
experimental design are in danger of being overlooked even in the fortunate
subjects where design is possible. In particular, the word "randomness" appears
to be used in two senses, which are not equivalent. To take an illustration from
seismic survey, the experiment would consist in the discharge of an explosion
at a known point and the recording of the times of transmission of elastic waves
through the ground to a set of recorders at known distances. The choice of the
distances is part of the design of the experiment; the problem is to find a pair of
parameters $a$ and $b$ as accurately as possible from observations of the times $t_r$ at
distances $x_r$. This will be done by the method of least squares, the equations of
condition being

$$t_r = a + b x_r.$$

Now the question of design will be, how should the arbitrary distances $x_r$ be
chosen? In practice the intervals are usually made as nearly equal as possible,
so that there is a high degree of system in the design. But an advocate of extreme
randomness in design would apparently assign all the distances by Tippett's
numbers or some analogous device. A possible consequence of this would be
that they would all be nearly equal and it would be impossible to determine
either $a$ or $b$ without external evidence concerning the value of the other. Or they
might all be concentrated near two values, when $a$ and $b$ could be determined, but
there would be no means of testing whether the time of transmission is adequately
represented by a linear formula; there are cases in practice where a square or a
cube term might be included, though the linear form is usually [...]. Thus
randomness in design may in some circumstances lead to the loss of information
that would be provided by a systematic design.
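
The loss of information described above can be put in figures. The following is
only a sketch under stated assumptions: the three sets of distances are invented
for illustration, the errors are taken as independent with a common standard
error, and the function names are hypothetical. It uses the standard least
squares result that the variance of the estimate of $b$ is the error variance
divided by $\Sigma(x_r - \bar{x})^2$, and the fact that a square term can be
tested only if at least three distinct distances are present.

```python
def slope_variance_factor(xs):
    """Return 1 / sum((x - mean)^2); var(b) equals sigma^2 times this factor."""
    mean = sum(xs) / len(xs)
    sxx = sum((x - mean) ** 2 for x in xs)
    return 1.0 / sxx


def can_test_curvature(xs, tol=1e-9):
    """A square term is estimable only with at least three distinct distances."""
    distinct = []
    for x in sorted(xs):
        if not distinct or x - distinct[-1] > tol:
            distinct.append(x)
    return len(distinct) >= 3


# Illustrative distances only (assumed, not taken from any survey).
uniform = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]    # equally spaced recorders
bunched = [3.4, 3.5, 3.5, 3.5, 3.5, 3.6]    # a draw with all distances nearly equal
two_point = [1.0, 1.0, 1.0, 6.0, 6.0, 6.0]  # all distances at two values

for name, xs in [("uniform", uniform), ("bunched", bunched), ("two-point", two_point)]:
    print(name, round(slope_variance_factor(xs), 4), can_test_curvature(xs))
```

With these figures the bunched arrangement leaves $b$ almost undetermined, while
the two-point arrangement estimates $b$ well but gives no test of linearity.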

On the other hand it may be explained that a systematic design may lead to an
underestimate of the uncertainty, and it is here that the ambiguity of the use
of "randomness" enters. The validity of the method of least squares rests on the
hypothesis of randomness in the sense that parameters $a$ and $b$ exist such that,
given their values and the standard error, the probabilities of the residuals at all
the distances used are independent. This may be true without the distances
[...] randomness of design will then permit several observations to be near the
same [...], and the probabilities of the residuals, given the distances, will not
be independent. If in an actual case the variation of level was harmonic, and the
recorders were set at uniform intervals so as always to be on the crests, there
would be a systematic error in the least squares solution; but this [...]
allowed for. In practice, however, variations of level are irregular, and
independence of the errors is best achieved by using uniform intervals so as to
minimize the increase of uncertainty due to possible correlations between
neighbouring distances. This does not exclude the possibility that a closer
approximation might be attempted by including a term proportional to the height in
the law assumed; but that would still not eliminate possible effects of local
variation in the velocities immediately below the recording [...], and
the systematic design would still approximate to the condition of independence
of the errors better than the random one would.

My excuse for mentioning this problem in a mainly biological journal is that
it is one that I am fairly familiar with, and that it seems to illustrate two points
that have not been adequately analysed in the biological literature. The first is
the absolute necessity of distinguishing between the probabilities of the same
proposition assessed on different data. Before designing the above experiment
we may have only the vague knowledge of $a$ and $b$ suggested by similar
experiments elsewhere. If we are to make a systematic design we do know something
about what the relations between the intervals will be; if we are to make a random
one we do not. But in either case, as soon as the experiment is designed, the
distances are definitely known in both cases, and are not random any longer.
The knowledge of the possible types of design that might have been adopted is no
longer relevant in either case, since we know the particular design that has been
adopted; it is now part of the data of the problem. The estimates will in either
case be made using the actual distances as data, not some aggregate of the
possible distances that might have occurred with that method of design. The
hypothesis of the randomness of the residuals, which is needed for the validity of
the method of least squares, has nothing to do, intrinsically, with the intended
randomness of the original design; in this case the latter randomness would
in practice often vitiate the validity of the former, while in any case increasing
the uncertainty of the estimates.

The second point is that any method of estimation presupposes a law of error
or chance containing adjustable parameters, such that if these were known the
probabilities of various experimental results are assigned; in other words, that
the likelihood, given these parameters, has a definite value. Now I maintain
that this is always what logicians call a considered proposition, not an asserted
one; there is no proof in any actual case that no parameters other than those so
far thought of will ever be required; any theory of significance tests contemplates
the possibility that others may be needed. But if a new parameter, whose value
is not yet known, may be relevant, then the law of error, given those already
considered, is not unique and the likelihood has not a definite value. In the
seismic survey, we do not assert that the relation between distance and time of
travel is linear; we merely consider it, and use it until we find some definite
evidence against it in a particular case. But we need the information that will
find out if it is wrong; and this must be done by making sure that the presence
of higher powers can be tested. This is done by the systematic design. The random
design would leave it to chance, and abandon the attempt to test the law if it
should give an arrangement with all the distances concentrated near two values.
It would be pleasant to be able to say that we already know what parameters
can be relevant; but we have to face the fact that we do not. What we can do is to
provide a procedure for testing them as they arise; but this necessarily implies
that we must proceed from the hypothesis involving fewer adjustable parameters
to the one involving more. The hypothesis that a limited number of parameters
is adequate must be considered in any case, but that is not the same as saying
that no others will ever be needed. Estimates, however, are always subject to
the hypothesis that no parameters other than those explicitly considered are
relevant. There appears to be no escape from this dilemma; we can only admit it,
state the hypothesis explicitly, and say that the results are the best that we can
give in the actual state of our knowledge at the time. Confidence in it will depend
mainly on the failure to find evidence for other parameters that might conceivably
have been relevant to the observations already available.

Let us now consider how this may apply to problems of sampling a population
and the design of agricultural experiments. In the former case the population
consists of several different types, of numbers $\nu_1, \nu_2, \ldots$, total $\nu$. The
numbers of these types in our sample are $n_1, n_2, \ldots$, total $n$. It is not in our
power to change the $\nu_r$, and practical considerations may fix $n$. But
$n_1, n_2, \ldots$ are at our disposal. The question is, how should they be chosen?
Neyman (1934) has shown that if the standard errors of the respective types are
$\sigma_r$ and the means found for the types are $x_r$, then provided that the $\nu_r$ are
known the most accurate estimate of the mean for the entire population would be
got by taking the $n_r$ in proportion to $\nu_r\sigma_r$, and the estimate as
$\Sigma\nu_r x_r/\nu$. If all the $\sigma_r$ are equal, the $n_r$ should be taken in proportion
to the $\nu_r$, and the best estimate of the mean for the population will be the mean
over the sample, $\Sigma n_r x_r/n$, irrespective of the differences between different
types. The standard error also is independent of those differences. It reduces
simply to a matter of estimating the mean for each type from a sample of that
type, and its uncertainty is determined only by the variation within types; and
the mean for the whole population is the mean of the type means, weighted in
proportion to the $\nu_r$, which are in proportion to the $n_r$.
Now this is a highly systematic method of sampling. The random method would
be to choose $n$ individuals from the entire population by selecting from a directory
by some set of "random numbers". The numbers of the types in the sample
would then depart from proportionality. [...]
it will have an additional error on account of [...]
of the type differences and the [...]
error of the result so obtained is correctly [...] the
variation within the sample [...] than that of the
mean of a proportional sample. If [...] the mean of the sample as
our estimate of the mean of the population, weighting the type means
according to $n_r$, we weight them according to $\nu_r$, we recover an estimate that
is independent of the variation of the type means and has a higher
accuracy, approaching that for the proportional sample.

To apply this method it is necessary that the $\nu_r$ should be known. If they were
unknown the only way of estimating them would be by [...]
from the whole population, and their most probable values would be in the
proportion of the $n_r$. Then the mean of the sample would be the best estimate of the
population mean, and its standard error would be correctly found from the variation
in the whole sample. The point is that information about the $\nu_r$ affects the
estimate of the population mean and its standard error, given the sample; and
this will be our actual position, since our problem is precisely to say something
about the population given the sample. If we do not know the [...]
numbers apart from what the sample can tell us about them there is no more to
be said; but if we do know them and omit to use them we are sacrificing
information and introducing avoidable error into the estimate of the mean. The
conclusions drawn from a given sample are not the same if the $\nu_r$ are part of
the data as they are if the $\nu_r$ are unknown.
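
The gain from using known type numbers can be illustrated numerically. The
sketch below is not from the paper: the type numbers, standard errors and sample
size are hypothetical, the function names are invented, and the correction for
sampling from a finite population is ignored. It compares the variance of the
weighted estimate $\Sigma\nu_r x_r/\nu$ when the $n_r$ are proportional to $\nu_r$ with
the variance when they are proportional to $\nu_r\sigma_r$, the allocation attributed
above to Neyman (1934).

```python
def variance_of_weighted_mean(nu, sigma, n_alloc):
    """Variance of sum(nu_r * x_r) / nu when n_alloc[r] individuals come from type r."""
    total = sum(nu)
    return sum((v / total) ** 2 * s ** 2 / m for v, s, m in zip(nu, sigma, n_alloc))


def allocate(weights, n):
    """Split n in proportion to weights (fractional sizes kept for simplicity)."""
    total = sum(weights)
    return [n * w / total for w in weights]


# Hypothetical population: three types with unequal variation within types.
nu = [6000, 3000, 1000]
sigma = [1.0, 2.0, 5.0]
n = 100

proportional = allocate(nu, n)
neyman = allocate([v * s for v, s in zip(nu, sigma)], n)

var_prop = variance_of_weighted_mean(nu, sigma, proportional)
var_neyman = variance_of_weighted_mean(nu, sigma, neyman)
print(var_prop, var_neyman)
```

When all the $\sigma_r$ are equal the two allocations coincide, which is the
proportional case described in the text.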

On the other hand it is claimed that the method of random sampling is
unbiased. Now the word "bias" also is capable of several interpretations.
If the standard deviations within types are different, but their numbers
are known, it has been seen that the best estimate will be obtained by deliberately
introducing a departure from proportionality into the sampling and allowing for
it afterwards by weighting according to $\nu_r$. It might be said, not unfairly, that
any procedure that gives an estimate different from the best one using the whole
of the available data is biased. It is true that random sampling at the outset is
designed to give every possible sample of given size an equal chance of being the
actual sample; but when the sample has actually been taken, that is the [...],
and the intentions of the designer are no longer relevant. If the numbers are
not in fact proportional we are entitled to allow for the departure and shall gain
accuracy by doing so. The fact that the use of the sample mean [...]
might give an error with either sign in a future experiment does not imply that
in a particular case it is equally likely to have given one with either sign. It is
sometimes said that by taking a random sample we avoid dangers that might
arise through a systematic sample not being in proportion. Thus Yates [...]
equally often he introduces one systematic restriction; when he applies [...]
to each square separately he introduces another; when the variety is represented
once in every row and every column he introduces others. The complete
procedure has never, I think, been adopted in such a case; it does not seem
to have even been considered; but statement of it shows how far Dr Fisher has
gone in the direction advocated by "Student". The latter [...] the point
that the Latin square is both random and balanced, but the two properties refer
to different features of the design. The systematic features enable the maximum
accuracy to be obtained for each variety, consistently with [...]. But I
think that some further attention should be given to the function of the
randomizing of rows and columns that is carried out when the main features of the
Latin square design are fixed. There would be no point in this if the probabilities
of the plot errors, given the varietal, row and column values, were all independent.
The hypothesis required by the method of least squares would be satisfied;
the analysis of variance is equivalent to the method of least squares applied to a
system where the normal equations contain only diagonal terms. On the other
hand a systematic ground effect that upsets the independence of the errors would
require the explicit introduction of a new parameter to express it; this would not
necessarily enter orthogonally to the others, and the analysis of variance
would have to be replaced by a detailed solution by least squares. Now the
absence of such terms cannot be guaranteed. To illustrate how they can enter,
let us suppose axes of position taken parallel to the sides, with the centre of the
square as origin. Then if $(x, y)$ are the coordinates of the centre of a plot, suppose
first that the fertility may be expressed in the form

$$R = a_0 + a_1 x + b_1 y + a_2 x^2 + b_2 y^2 + \ldots + a_4 x^4 + b_4 y^4.$$

The elimination of rows and columns, in a 5 x 5 square, allows for and eliminates
all powers of $x$ and $y$ separately up to $x^4$ and $y^4$: evidently any set of row and
column totals could be fitted by choosing the nine coefficients suitably. But
neither the design of the square nor the analysis of variance technique would
deal with a term like $xy$, which might be present. Whatever its coefficient may be,
it will contribute nothing to the row and column totals; but $\Sigma xy$ for all plots of a
particular variety will not in general vanish; indeed it is possible for $\Sigma xy$ to have
the same sign for four of them, vanishing for the other. Thus the presence of
such a term will make a contribution to the estimates of the varietal differences
unless provision is made against it. Now this is not merely a theoretical danger.
If terms in $x^2$ and $y^2$ are relevant it will be only for one particular pair of directions
of the sides that the coefficient of $xy$ in the fertility will vanish, and it must be
presumed that if there is reason to attend to $x^2$ and $y^2$ we should also attend to $xy$.
The most accurate way of doing so would be to introduce it explicitly into the
equations of condition and estimate it by the method of least squares. In
estimating the uncertainty allowance would have to be made for the fact that an
additional parameter has been estimated for every square. But since it will not
enter orthogonally to the varietal differences the labour of calculation would be
greatly increased. Again, if $x^2$ and $y^2$ are relevant, there is reason to expect
$x^2y$ and $xy^2$ to be, and so on. The analysis would become prohibitively difficult.
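
The behaviour of the $xy$ term can be checked directly on a small example. The
following is a minimal sketch, assuming a cyclic 5 x 5 Latin square (chosen only
because it is easy to write down; it is not a square discussed in the paper) and
unit plot spacing with the origin at the centre. The function names are
hypothetical. It confirms that $xy$ contributes nothing to any row or column
total, while $\Sigma xy$ over the plots of a variety does not vanish for every
variety.

```python
SIZE = 5


def cyclic_latin_square(size):
    """Variety in row i, column j is (i + j) mod size (an assumed, simple square)."""
    return [[(i + j) % size for j in range(size)] for i in range(size)]


def xy_sums_by_variety(square):
    """Sum of x*y over the plots of each variety, origin at the centre of the square."""
    size = len(square)
    half = (size - 1) / 2
    sums = [0.0] * size
    for i, row in enumerate(square):
        for j, variety in enumerate(row):
            sums[variety] += (i - half) * (j - half)
    return sums


def xy_row_and_column_totals(size):
    """Totals of x*y along every row and every column of the square."""
    half = (size - 1) / 2
    rows = [sum((i - half) * (j - half) for j in range(size)) for i in range(size)]
    cols = [sum((i - half) * (j - half) for i in range(size)) for j in range(size)]
    return rows, cols


square = cyclic_latin_square(SIZE)
variety_sums = xy_sums_by_variety(square)
row_totals, col_totals = xy_row_and_column_totals(SIZE)
print(variety_sums)   # [0.0, 5.0, 5.0, 0.0, -10.0]
print(row_totals, col_totals)
```

For this particular square a term $c\,xy$ in the fertility would therefore add
$5c$, $5c$ and $-10c$ to three of the varietal totals and nothing to the other two.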

Now this is where the use of randomization comes in. In a single 5 x 5 Latin
square it is impossible for $\Sigma xy$ to vanish for every variety. (I do not know whether
it would be possible for larger squares, but even if it is, higher product terms
would require attention.) But it is possible to arrange the design so that, without
evaluating the contributions from $xy$ and analogous terms explicitly, they will
contribute independently to the totals. This is done by repeating the square,
using separate acts of randomization for each square. It is important to notice
that the object would not be achieved if the first square was simply copied. If
this was done, the values of $\Sigma xy$ for each treatment in one square, and for one
treatment in each of the others, would determine the values for all the other
treatments in the other two squares, and the hypothesis of independence would be
wrong. The separate randomizations ensure that, with three replications, the
contributions of the $xy$ and similar terms to the varietal totals are each the sum
of three independent components. Omission to evaluate them explicitly may
sacrifice information that could possibly be recovered, and thereby lead to loss of
accuracy in estimating the main effects; but so far as that is a criticism it does not
apply here to the randomization but to the analysis of variance technique,
which will be valid in these circumstances only if some artificial device is
introduced to convert the systematic disturbance into one that can legitimately be
treated as random. The real justification of using this technique rather than a
complete least squares solution, taking into account $xy$ and possibly higher
product terms, is practical convenience. The justification of the separate
randomization in relation to it is that it prevents differences from being interpreted
as due to varietal or treatment differences when they are really due to the
accumulation of neglected ground effects. To advocate a least squares solution for
the neglected terms may well be analogous to saying that an investigator should
carry out all computations to seven figures when he knows quite well that the
second will be uncertain in any case.

It seems to me that Fisher sums up the situation very well in his advice to
balance or eliminate the larger systematic effects as accurately as possible and
randomize the rest. It provides against two dangers: first, that the main object of
the work may be lost by an incorrect estimate of the larger effects, especially
through great departures of the normal equations from orthogonality; and
secondly, that a neglected term, small in any one observation, may mount up
when many are combined. But what is worth balancing, what is worth
randomizing, at the cost of a small increase of uncertainty, but not worth definitely
eliminating, and what is random anyhow, must be a matter of the particular
problem. The only rule is to attend to actual conditions and the types of
systematic variation likely to arise in them. I should see no point, for instance, in an
elaborate sampling of people by a list of random numbers when a list in
alphabetical order is available; if uniform intervals in such a list will not give a
random sample in the sense of the independence of the errors I do not see what
will. The only systematic variation that could bias the mean would be a
periodicity in terms of the position in the table, the wave-length being an exact
submultiple of the interval chosen; and this seems too remote a possibility to need
much consideration. But a similar difficulty might arise in relation to the
experimental layout mentioned by "Student", namely

A B B A A B B A.

Here, as "Student" indeed remarks, the estimated difference between A and B
would be biased if there was a harmonic variation of fertility whose period was an
odd multiple of the width of a quartet. He dismisses this as a not particularly
likely occurrence. But land is often ridged in just this way, and an unwary
designer might easily lay out his experiment in such a way that the difference
sought could not be separated from the periodic ground effect. In such land I
should say that the harmonic effect is likely to be the dominant ground effect
and that the best treatment would be to introduce it explicitly into the equations
of condition, and lay out the experiment so that the possible phases are uniformly
distributed both for A and B.
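
The point about the A B B A layout can be verified with a short calculation. This
is a sketch only, with assumed details that are not in the paper: eight plots of
unit width, a linear trend of arbitrary slope, and a harmonic fertility term
whose period is taken equal to the width of one quartet (the simplest case). The
function names are hypothetical. The balanced layout removes the linear trend
exactly but leaves the harmonic term fully confounded with the difference
between A and B.

```python
import math

LAYOUT = "ABBAABBA"   # the arrangement quoted from "Student"
QUARTET_WIDTH = 4     # plots in one A B B A group


def mean_difference(fertility):
    """Mean fertility of the A plots minus that of the B plots."""
    a = [fertility(k) for k, t in enumerate(LAYOUT) if t == "A"]
    b = [fertility(k) for k, t in enumerate(LAYOUT) if t == "B"]
    return sum(a) / len(a) - sum(b) / len(b)


def linear(k):
    """A steady fertility gradient along the strip (slope chosen arbitrarily)."""
    return 0.3 * k


def harmonic(k, period=QUARTET_WIDTH, phase=0.0):
    """A ridge-like variation with unit amplitude (assumed form)."""
    return math.cos(2 * math.pi * k / period + phase)


linear_bias = mean_difference(linear)
harmonic_bias = mean_difference(harmonic)
print(linear_bias, harmonic_bias)
```

With zero phase the spurious difference equals the full amplitude of the harmonic
term; other phases give smaller values, which is why distributing the phases
uniformly over A and B is suggested in the text.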

In the 5 x 5 Latin square, there is no logical reason why we should not replace
an expression of possible ground effects by an expression of twenty-five terms
with adjustable coefficients; then there would be no means left for separating
treatment effects from ground ones. Equally there seems to be none for not
treating the whole variation in terms of block mean, treatments, and random
error. The place to draw the line between these two extremes cannot be decided
by pure theory; it is indicated by previous experience, which has indicated that
some types of ground effect are habitually significant, others occasionally
significant, and others legitimately treated as random. What Fisher's
randomization does is really to combine members of the second class in such a way that
they may be included in the third. But extreme randomization would also try to
randomize the first class, and the disadvantages of this seem far to outweigh any
possible advantage.

However, I must now return to a subject where the normal equations are
never orthogonal; where the normal law of error never holds; and where it is
impossible to separate the ground effects in different regions from possible
systematic differences between the observers inhabiting those regions.
A NOTE ON NORMAL CORRELATION*


By E. J. G. PITMAN
University of Tasmania


In a recent paper Finney (1938) discussed the distribution of the ratio of estimates
of the two variances, in a sample from a normal bi-variate population, and showed
how a test for significance may be applied when the population correlation
coefficient is known. He also showed how the test may be adapted when only an
estimate of this correlation is available, but this adaptation is not completely
satisfactory. This note shows how, by using a different distribution, an
exact test for significance can be obtained, and fiducial limits for the ratio of
the population variances determined, when the population correlation coefficient
is unknown.

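* [This paper and that which follows by W. A. Morgan were received for publication at about
the same date. The subject of both was suggested by D. J. Finney's paper, but the two
treatments and applications of the improved solution of his problem, when $\rho$ is unknown,
follow quite different lines. Ed.]

Suppose that $x$, $y$ are normally correlated variables with probability function

$$\frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}}
\exp\left[-\frac{1}{2(1-\rho^2)}\left\{\frac{(x-a)^2}{\sigma_1^2}
-\frac{2\rho(x-a)(y-b)}{\sigma_1\sigma_2}
+\frac{(y-b)^2}{\sigma_2^2}\right\}\right],$$

and that $x_s$, $y_s$ $(s = 1, 2, \ldots, n)$ are $n$ pairs of observed values of $x$, $y$. Write

$$\bar{x} = \Sigma x_s/n, \qquad \bar{y} = \Sigma y_s/n,$$

$$S_1^2 = \frac{\Sigma(x_s-\bar{x})^2}{n-1}, \qquad
S_2^2 = \frac{\Sigma(y_s-\bar{y})^2}{n-1}, \qquad
r = \frac{\Sigma(x_s-\bar{x})(y_s-\bar{y})}{(n-1)S_1S_2}.$$

Since

$$\frac{1}{2(1-\rho^2)}\left\{\frac{(x-a)^2}{\sigma_1^2}
-\frac{2\rho(x-a)(y-b)}{\sigma_1\sigma_2}
+\frac{(y-b)^2}{\sigma_2^2}\right\}
=\frac{1}{4(1+\rho)}\left[\frac{x-a}{\sigma_1}+\frac{y-b}{\sigma_2}\right]^2
+\frac{1}{4(1-\rho)}\left[\frac{x-a}{\sigma_1}-\frac{y-b}{\sigma_2}\right]^2,$$

it is evident that

$$\frac{x}{\sigma_1}+\frac{y}{\sigma_2}, \qquad \frac{x}{\sigma_1}-\frac{y}{\sigma_2}$$

are independent normal variables with variances $2(1+\rho)$, $2(1-\rho)$ respectively.
Thus, if

$$u_s = \frac{x_s}{\sigma_1}+\frac{y_s}{\sigma_2}, \qquad
v_s = \frac{x_s}{\sigma_1}-\frac{y_s}{\sigma_2},$$

the $n$ pairs of numbers $u_s$, $v_s$ constitute a sample from a normal bi-variate
population with zero correlation. Hence the distribution of their correlation
coefficient,

$$R = \frac{\Sigma(u_s-\bar{u})(v_s-\bar{v})}
{\sqrt{\Sigma(u_s-\bar{u})^2\,\Sigma(v_s-\bar{v})^2}}
= \frac{S_1^2/\sigma_1^2 - S_2^2/\sigma_2^2}
{\sqrt{(S_1^2/\sigma_1^2 + S_2^2/\sigma_2^2)^2 - 4r^2S_1^2S_2^2/(\sigma_1^2\sigma_2^2)}},$$

is known. Putting

$$w = S_1^2/S_2^2, \qquad \omega = \sigma_1^2/\sigma_2^2,$$

we have

$$R = \frac{w-\omega}{\sqrt{(w+\omega)^2 - 4r^2w\omega}}.$$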

The values of $w$ and $r$ are given by the sample, hence, from the known
distribution of $R$, fiducial limits for $\omega$ can be determined. If we merely want to test
whether the values of $S_1$ and $S_2$ are significantly different, i.e. if we wish to test
the null hypothesis $\omega = 1$, we simply insert this value of $\omega$ and test for significance
the corresponding value of the correlation coefficient $R$. In determining fiducial
limits for $\omega$ the arithmetic is a little easier if we make use of "Student's"
$z$-distribution, or Fisher's $t$-distribution. We know that $R^2$ has a
$\beta\{\tfrac{1}{2}, \tfrac{1}{2}(n-2)\}$ distribution, and that

$$\frac{R}{\sqrt{1-R^2}} = \frac{w-\omega}{\sqrt{4(1-r^2)w\omega}}$$

is distributed like "Student's" $z$ for a sample of $n-1$, while

$$t = \frac{R\sqrt{n-2}}{\sqrt{1-R^2}} = \frac{(w-\omega)\sqrt{n-2}}{\sqrt{4(1-r^2)w\omega}}$$

is distributed like Fisher's $t$ with $n-2$ degrees of freedom.

If $\alpha$ is any given number less than 1, we can determine $k$ such that

$$P\{|t| \le k\} = \alpha.$$

The inequality $|t| \le k$ is equivalent to

$$\omega^2 - 2Kw\omega + w^2 \le 0,$$

where

$$K = 1 + \frac{2k^2(1-r^2)}{n-2};$$

hence

$$P\{w(K-\sqrt{K^2-1}) \le \omega \le w(K+\sqrt{K^2-1})\} = \alpha.$$
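
The fiducial limits follow from the quadratic inequality above and are easily
computed. The sketch below is illustrative only: the values of $w$, $r$, $n$ and $k$
are hypothetical (they are not those of Finney's example), $k$ is supplied by the
user from a table of $t$, and the function names are invented. It also checks
that $|t| = k$ at each limit.

```python
import math


def fiducial_limits(w, r, n, k):
    """Limits for omega from |t| <= k, with K = 1 + 2 k^2 (1 - r^2) / (n - 2)."""
    big_k = 1.0 + 2.0 * k * k * (1.0 - r * r) / (n - 2)
    root = math.sqrt(big_k * big_k - 1.0)
    return w * (big_k - root), w * (big_k + root)


def t_statistic(w, omega, r, n):
    """t = (w - omega) sqrt(n - 2) / sqrt(4 (1 - r^2) w omega)."""
    return (w - omega) * math.sqrt(n - 2) / math.sqrt(4.0 * (1.0 - r * r) * w * omega)


# Hypothetical inputs; k is the two-sided 5% point of t with 28 degrees of freedom.
w, r, n, k = 1.8, 0.6, 30, 2.048
lower, upper = fiducial_limits(w, r, n, k)
print(lower, upper)
print(t_statistic(w, lower, r, n), t_statistic(w, upper, r, n))
```

Since the product of the two roots of the quadratic is $w^2$, the limits are
symmetrical about $w$ on a logarithmic scale.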

In the example given by Finney (1938, p. 192),

$$n = [\ldots], \qquad S_1 = [\ldots], \qquad S_2 = [\ldots], \qquad r = 0.878.$$

Substituting these values in the expression for $t$, and putting $\omega = 1$, we obtain
$t = [\ldots]$. As $n$ is large the distribution of $t$ is approximately normal with zero
mean and unit standard deviation, and therefore the value of $t$ is significant at the
1% level. It may be noted that in this particular case the exact test gives less
significance than Finney's.

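For comparison with the case of uncorrelated normal variables we may note
that if $\rho = 0$,

$$\frac{w}{w+\omega}$$

has a $\beta\{\tfrac{1}{2}(n-1), \tfrac{1}{2}(n-1)\}$ distribution, and therefore

$$\frac{(w-\omega)^2}{(w+\omega)^2}$$

has a $\beta\{\tfrac{1}{2}, \tfrac{1}{2}(n-1)\}$ distribution. As shown above, for any value of $\rho$,

$$\frac{(w-\omega)^2}{(w+\omega)^2 - 4r^2w\omega}$$

has a $\beta\{\tfrac{1}{2}, \tfrac{1}{2}(n-2)\}$ distribution.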

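The same method gives very easily* the distribution required for the
estimation of $\rho$ when the standard deviations of $x$ and $y$ are the same,
$\sigma_1 = \sigma_2 = \sigma$.

$$u = x+y, \qquad v = x-y$$

are then independent normal variables with variances $2\sigma^2(1+\rho)$, $2\sigma^2(1-\rho)$;
hence

$$\frac{\Sigma(u_s-\bar{u})^2}{2\sigma^2(1+\rho)}, \qquad
\frac{\Sigma(v_s-\bar{v})^2}{2\sigma^2(1-\rho)}$$

are independent chance variables each distributed like $\chi^2$ with $n-1$ degrees of
freedom, and so the distribution of their quotient

$$w' = \frac{1-\rho}{1+\rho}\cdot\frac{\Sigma(u_s-\bar{u})^2}{\Sigma(v_s-\bar{v})^2}$$

is known; $w'/(1+w')$ has a $\beta\{\tfrac{1}{2}(n-1), \tfrac{1}{2}(n-1)\}$ distribution. The
expression for $w'$ is

$$w' = \frac{1-\rho}{1+\rho}\cdot\frac{1+r'}{1-r'},$$

where

$$r' = \frac{2\Sigma(x_s-\bar{x})(y_s-\bar{y})}{\Sigma(x_s-\bar{x})^2 + \Sigma(y_s-\bar{y})^2}.$$

* Cf. De Lury (1938), where the same problem is dealt with by a different method.

If $x$ and $y$ have the same mean value $a$ as well as the same standard deviation,
the mean value of $v$ is 0, and therefore

$$\frac{\Sigma v_s^2}{2\sigma^2(1-\rho)}$$

is distributed like $\chi^2$ with $n$ degrees of freedom. Thus the distribution of the
quotient,

$$w'' = \frac{1-\rho}{1+\rho}\cdot\frac{\Sigma(u_s-\bar{u})^2}{\Sigma v_s^2},$$

is known; $w''/(1+w'')$ has a $\beta\{\tfrac{1}{2}(n-1), \tfrac{1}{2}n\}$ distribution. The expression
for $w''$ is

$$w'' = \frac{1-\rho}{1+\rho}\cdot\frac{1+r''}{1-r''},$$

where

$$r'' = \frac{2\Sigma(x_s-m)(y_s-m)}{\Sigma(x_s-m)^2 + \Sigma(y_s-m)^2},
\qquad m = \tfrac{1}{2}(\bar{x}+\bar{y}).$$

The procedure in this case corresponds exactly to the estimation of intraclass
correlation by the analysis of variance.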

REFERENCES

Finney, D. J. (1938). Biometrika, 30, 190-2.

De Lury, D. B. (1938). Ann. Math. Statist. 9, 149-51.



A TEST FOR THE SIGNIFICANCE OF THE DIFFERENCE
BETWEEN THE TWO VARIANCES IN A SAMPLE FROM
A NORMAL BIVARIATE POPULATION

By W. A. MORGAN
University College, London

1. DERIVATION OF THE TEST


In a paper published in a recent issue of this Journal D. J. Finney (1938)
considered the following questions. A sample of $n$ pairs of variables $(x, y)$ has been
drawn from the bivariate normal distribution whose probability law is

$$p(x, y) = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho_{12}^2}}
\exp\left[-\frac{1}{2(1-\rho_{12}^2)}\left\{\frac{(x-\xi_1)^2}{\sigma_1^2}
-2\rho_{12}\frac{(x-\xi_1)(y-\xi_2)}{\sigma_1\sigma_2}
+\frac{(y-\xi_2)^2}{\sigma_2^2}\right\}\right]. \qquad (1)$$

(i) What is the probability law of the ratio, $w$, where

$$w = s_1^2/s_2^2 \qquad (2)$$

and

$$s_1^2 = \sum_i (x_i-\bar{x})^2/(n-1), \qquad
s_2^2 = \sum_i (y_i-\bar{y})^2/(n-1)? \qquad (3)$$

(ii) Could this ratio be used as a criterion to test the hypothesis that $\sigma_1 = \sigma_2$?

Using a more direct method he was first able to confirm a previous result
of Bose (1935), giving the probability distribution of $w$ in the case where
$\sigma_1 = \sigma_2$, and hence to show that the chance that $w$ exceeds a given value, say $Q$,
could be obtained from the Tables of the Incomplete Beta Function (1934), using
the relation

[Equations (4) and (5), which express $P\{w > Q\}$ through the incomplete beta
function and define its argument, are illegible in the source.]

Since the probability expression (4) is dependent on the population
correlation $\rho_{12}$, which will be in general unknown, Finney pointed out that the
ratio $w$ was not altogether a satisfactory criterion to use in testing the hypothesis
$\sigma_1 = \sigma_2$, but he put forward a possible method of getting round this difficulty.

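The question arises as to whether there is not some other more suitable
criterion for testing whether the variances are equal, whose sampling
distribution will be independent of $\rho_{12}$ and of any other parameters whose values are
not specified by the hypothesis itself. The likelihood ratio method of approach
of Neyman and Pearson may [...]
in a number of instances where the appropriate criterion is not immediately
obvious. Summarized briefly, it involves the following steps:

(a) A specification of the set, $\Omega$, of admissible hypotheses. In the present case
these will be defined by the joint probability law of the $n$ pairs of observations,

$$p(x_1, y_1; \ldots; x_n, y_n \mid \xi_1, \xi_2, \sigma_1, \sigma_2, \rho_{12})
= \left(\frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho_{12}^2}}\right)^{n}
\exp\left[-\frac{n}{2(1-\rho_{12}^2)}\left\{\frac{(\bar{x}-\xi_1)^2+s_1^2}{\sigma_1^2}
-2\rho_{12}\frac{(\bar{x}-\xi_1)(\bar{y}-\xi_2)+r_{12}s_1s_2}{\sigma_1\sigma_2}
+\frac{(\bar{y}-\xi_2)^2+s_2^2}{\sigma_2^2}\right\}\right], \qquad (6)$$

where $-\infty \le \xi_1, \xi_2 \le +\infty$; $0 \le \sigma_1, \sigma_2$; $-1 \le \rho_{12} \le 1$.
In this expression $\bar{x}$ and $\bar{y}$ are the sample means, $r_{12}$ the sample correlation
coefficient, and

$$s_1^2 = \sum_i (x_i-\bar{x})^2/n, \qquad s_2^2 = \sum_i (y_i-\bar{y})^2/n. \qquad (7)$$

(b) A determination of those values of the five unknown parameters, as
functions of the observations, which jointly maximize the expression (6). The
solution is known to be obtained when

$$\xi_1 = \bar{x}, \quad \xi_2 = \bar{y}, \quad \sigma_1 = s_1, \quad \sigma_2 = s_2, \quad \rho_{12} = r_{12}.$$

The maximum value of (6) is thus

$$p(\Omega\ \mathrm{max}) = \{2\pi e\, s_1 s_2 \sqrt{1-r_{12}^2}\}^{-n}. \qquad (8)$$

(c) A specification of the hypothesis tested. This hypothesis assumes that the
probability law is of the form

$$p(x_1, y_1; \ldots; x_n, y_n \mid \xi_1, \xi_2, \sigma, \rho_{12}), \qquad (9)$$

where the function is obtained by putting $\sigma_1 = \sigma_2 = \sigma$ in (6).

(d) A determination of the values of the four unknown parameters $\xi_1$, $\xi_2$, $\sigma$
and $\rho_{12}$ which maximize this expression. These values may be shown to be

$$\xi_1 = \bar{x}, \quad \xi_2 = \bar{y}, \quad \sigma^2 = \tfrac{1}{2}(s_1^2+s_2^2), \quad
\rho_{12} = 2r_{12}s_1s_2/(s_1^2+s_2^2). \qquad (10)$$

The maximum value of the probability function defined in (c) is then

$$p(\omega\ \mathrm{max}) = \{e\pi\sqrt{[(s_1^2+s_2^2)^2 - 4r_{12}^2s_1^2s_2^2]}\}^{-n}. \qquad (11)$$

(e) The likelihood ratio criterion is then

$$\lambda = \frac{p(\omega\ \mathrm{max})}{p(\Omega\ \mathrm{max})}
= \left\{\frac{4s_1^2s_2^2(1-r_{12}^2)}{(s_1^2+s_2^2)^2 - 4r_{12}^2s_1^2s_2^2}\right\}^{n/2} \qquad (12a)$$

$$= \left\{1 - \frac{(s_1^2-s_2^2)^2}{(s_1^2+s_2^2)^2 - 4r_{12}^2s_1^2s_2^2}\right\}^{n/2}. \qquad (12b)$$

(f) The hypothesis tested becomes less and less likely as $\lambda$ moves from 1 to 0.
To complete the test it is necessary to know the sampling distribution of $\lambda$, or
of a single valued function of $\lambda$, when the hypothesis tested is true.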

It will be seen at once that the test differs from Finney's because the criterion, unlike his $\omega$, is a function of $r_{12}$ as well as $s_1$ and $s_2$. The meaning of the criterion which has been picked out by the $\lambda$-method becomes clear if we make the following transformation of the original variables:

Write $x = X + Y$, $y = X - Y$, (13)

so that $X = \tfrac{1}{2}(x + y)$, $Y = \tfrac{1}{2}(x - y)$. (14)


Then the population variances of x and y may be expressed as functions of the variances and correlation for X and Y, as follows:

$$\sigma_1^2 = \sigma_X^2 + \sigma_Y^2 + 2\rho_{XY}\sigma_X\sigma_Y, \qquad \sigma_2^2 = \sigma_X^2 + \sigma_Y^2 - 2\rho_{XY}\sigma_X\sigma_Y. \qquad (15)$$

The necessary and sufficient condition that the hypothesis tested is true, or that $\sigma_1 = \sigma_2$, is that

$$\rho_{XY} = 0. \qquad (16)$$


Since X and Y are normally correlated variables, the appropriate criterion to test the hypothesis, $\rho_{XY} = 0$, is the sample correlation coefficient between the transformed variables, i.e. $r_{XY}$. If the hypothesis is true, this coefficient has the well-known probability law

$$p(r_{XY} \mid \rho_{XY} = 0) = \text{constant} \times (1 - r_{XY}^2)^{\frac{1}{2}(n-4)}. \qquad (17)$$

Making use of the transformations (14), it is found that

$$r_{XY} = \frac{s_1^2 - s_2^2}{\{(s_1^2+s_2^2)^2 - 4 r_{12}^2 s_1^2 s_2^2\}^{1/2}}. \qquad (18)$$

Hence the likelihood criterion of (12b) is seen to be

$$\lambda = \{1 - r_{XY}^2\}^{n/2}, \qquad (19)$$

and as the hypothesis tested becomes less and less likely, $\lambda \to 0$ or $r_{XY}^2 \to 1$. The test may therefore be carried out by (a) referring the $r_{XY}$ of (18) to the probability distribution (17), or (b) alternatively referring

$$t = \frac{r_{XY}\sqrt{n-2}}{\sqrt{1 - r_{XY}^2}} \qquad (20)$$

to "Student's" distribution with degrees of freedom $f = n - 2$, and (c) rejecting the hypothesis when $|r_{XY}|$ or $|t|$ fall beyond the desired probability level. The test, it will be seen, is independent of the unknown correlation $\rho_{12}$ between x and y.
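
The procedure just described can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, the sample variances use the divisor n as in the likelihood above, and the final comparison of t with a tabled "Student" level is left to the reader.

```python
import math


def morgan_variance_test(x, y):
    """Return (r_XY, t, f) for the test of sigma_1 = sigma_2 on n paired values."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two samples of equal length n >= 3")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sample variances and covariance with divisor n.
    s1_sq = sum((a - mean_x) ** 2 for a in x) / n
    s2_sq = sum((b - mean_y) ** 2 for b in y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    r12 = cov / math.sqrt(s1_sq * s2_sq)
    # Correlation between X = (x + y)/2 and Y = (x - y)/2, as in (18).
    denom = math.sqrt((s1_sq + s2_sq) ** 2 - 4.0 * r12 ** 2 * s1_sq * s2_sq)
    r_xy = (s1_sq - s2_sq) / denom
    # "Student's" t with f = n - 2 degrees of freedom, as in (20).
    f = n - 2
    t = r_xy * math.sqrt(f) / math.sqrt(1.0 - r_xy ** 2)
    return r_xy, t, f


def likelihood_criterion(r_xy, n):
    """The criterion lambda written as a function of r_XY, as in (19)."""
    return (1.0 - r_xy ** 2) ** (n / 2.0)
```

The hypothesis would then be rejected when |t| exceeds the chosen percentage point of t for f degrees of freedom.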

2. The power of the test


While the probability distribution of $r_{XY}$ is independent of $\rho_{12}$ if the hypothesis ($\sigma_1 = \sigma_2$) is true, the chance that in using the rejection rule of the test we shall detect real differences between $\sigma_1$ and $\sigma_2$ will depend on the value of $\rho_{12}$. Suppose that we fix a probability level $r(\alpha)$ such that

$$\alpha = \int_{r(\alpha)}^{1} c\,(1 - r^2)^{\frac{1}{2}(n-4)}\,dr,^{*} \qquad (21)$$

where $\alpha$, for example, equals 0.05.

* c is the constant of the probability law (17).
Then the chance of rejecting the hypothesis that

$$\sigma_1/\sigma_2 = 1,$$

when $\gamma = \gamma_1 \neq 1$, is given by the expression

$$P(\gamma_1) = 1 - \int_{-r(\alpha)}^{+r(\alpha)} p(r \mid \rho_{XY})\,dr.$$

This expression Neyman & Pearson have termed the power of the test of the hypothesis that $\gamma\ (= \sigma_1/\sigma_2) = 1$ with regard to an alternative $\gamma = \gamma_1$. In this expression $p(r \mid \rho)$ denotes the general probability law for r in samples from a bivariate normal population, first obtained by R. A. Fisher (1915). It will be seen that a relation exists between $\rho_{XY}$, $\gamma$ and $\rho_{12}$:

$$\rho_{XY} = \frac{\gamma - \gamma^{-1}}{\{(\gamma + \gamma^{-1})^2 - 4\rho_{12}^2\}^{1/2}}. \qquad (24)$$

Owing to the symmetry of the distribution of r when $\rho = 0$, the power of the test will be the same for alternatives $\rho_{XY}$ and $-\rho_{XY}$; it will therefore be the same for alternatives $\gamma$ and $\gamma^{-1}$. For example, the test is as likely to reject the hypothesis ($\sigma_1 = \sigma_2$) when $\sigma_1 = 2\sigma_2$ as when $\sigma_1 = \tfrac{1}{2}\sigma_2$. This is clearly what we should expect, as $\sigma_1$ and $\sigma_2$ are in no way differentiated.

In Fig. 1 I have shown the power function of the test taking $\alpha = 0.05$* and sample size n = 25, for alternatives $\gamma > 1$, in the three cases $\rho_{12} = 0$, $\rho_{12} = 0.5$, and $\rho_{12} = 0.8$. It will be noticed that the test is more powerful when $\rho_{12}$ is larger. This of course follows from (24), since for a given value of $\gamma$, $\rho_{XY}$ will be further from zero the nearer $|\rho_{12}|$ is to unity.
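
A small numerical check makes the two properties of relation (24) concrete: the symmetry between $\gamma$ and $\gamma^{-1}$, and the growth of $|\rho_{XY}|$ with $|\rho_{12}|$. The function name below is hypothetical and the sketch assumes only the relation as written.

```python
import math


def rho_xy(gamma, rho12):
    """Population correlation of X and Y for a ratio gamma = sigma_1/sigma_2."""
    if gamma <= 0.0 or not -1.0 < rho12 < 1.0:
        raise ValueError("need gamma > 0 and |rho12| < 1")
    numerator = gamma - 1.0 / gamma
    denominator = math.sqrt((gamma + 1.0 / gamma) ** 2 - 4.0 * rho12 ** 2)
    return numerator / denominator
```

Calling `rho_xy(1.5, 0.0)`, `rho_xy(1.5, 0.5)` and `rho_xy(1.5, 0.8)` in turn gives values that move steadily away from zero, which is why the test gains power as the correlation between x and y increases.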

The computations were made with the help of F. N. David's recently published Tables (1938). The work was simplified by taking convenient values of $\rho_{XY}$, 0.1, 0.2, ..., 0.9, and finding the corresponding values of $\gamma$ by means of (24). The table on p. 18 shows, in the columns headed Test (a), the values of the power function computed in this way for the cases of $\rho_{12} = 0$ and $\rho_{12} = 0.5$, for n = 25 and n = 100.


3. Comparison with Finney's test in the case where $\rho_{12}$ is known

In the case where $\rho_{12}$ is not known, Finney has suggested that his test criterion $\omega$ might be used by making a double appeal to significance levels on the lines proposed in another case by Hirschfeld (1931). It does not, however, appear easy to determine numerically the power of the resulting test. In the case where $\rho_{12}$ is known, it may be shown that the likelihood ratio criterion now becomes

A- 


1 , j ""* 

25 i« 2 (l-rijpij)) 


I '4' 


(««• If 


2 w(l 


"htPn 

* This means that the hypothesis is to be rejected when $r_{XY}$ falls beyond the 5 % level in either tail of the distribution (17), or for the case n = 25, when $|r_{XY}| >$ ...
This is not merely a function of Finney's criterion $\omega$, since it depends again on the statistic $r_{12}$. The form of this expression is interesting, as it shows that if $\rho_{12}$ differs from zero, in view of the correlation which exists between $s_1$, $s_2$ and $r_{12}$, the values of all three are relevant in examining the hypothesis $\sigma_1 = \sigma_2$. I have not succeeded in determining the sampling distribution of the $\lambda$ of (25) or of any single-valued function of $\lambda$.



In the case of $\rho_{12}$ known, it is, however, possible to compare Finney's test (involving $\rho_{12}$) with the likelihood test appropriate in the case where $\rho_{12}$ is unknown, but still of course applicable when it is known. The power function for Finney's test may be computed as follows.

When using this test the hypothesis that $\gamma = 1$ is rejected if $\omega > \theta$ or if $\omega < \theta^{-1}$, where $\theta$ is a constant chosen by using relations (4) and (6) so that

$$\int_{\theta}^{\infty} p(\omega \mid \gamma = 1)\,d\omega = \tfrac{1}{2}\alpha. \qquad (26)$$

The power of the test with regard to an alternative hypothesis, $\gamma \neq 1$, is therefore given by

$$P(\gamma) = 1 - P\{\theta^{-1} < \omega < \theta\} = \int_{0}^{\theta^{-1}} p(\omega \mid \gamma)\,d\omega + \int_{\theta}^{\infty} p(\omega \mid \gamma)\,d\omega, \qquad (27)$$

where p(w ) y) « constant x (1 _pf, )!(«--« ~ j - 4pjgj . 

......(28) 


* See K. Pearson (1913).
The above integrals may be transformed to give

- 1 H - t t .«■!*’< *1 ;('!*! 

^’(yl = ’ *2 ) '* 2 " s ' 

where :=« |[1 -H'yii ■“>'■“ */»* ’I \ 115'“'^' y l•]r''s'l^ 

and ‘K'lr/ ‘i*'’ 

so that values of the power function may be calculated using the Tables of the Incomplete Beta Function. These values are compared in the table on p. 18 above with those for the likelihood ratio test.

[Table: Comparison of power functions of (a) the likelihood test based on $r_{XY}$, (b) Finney's test (based on $\omega$) when $\rho_{12}$ is known.]

It will be seen that:

(1) When $\rho_{12} = 0$, i.e. when the two variables are known to be independent, the test based on $\omega$ or the ratio of sample variances is the better. This is a result already known, but the table shows how small is the difference between the two tests.

(2) When $\rho_{12}$ is 0.5 or 0.8, however, the likelihood ratio test is somewhat more sensitive for the small departures in $\gamma$ from unity, and less sensitive for large departures than the $\omega$ test. Practically, this means that when $\rho_{12}$ is known, the $\omega$ test is slightly the better, since we are most interested in situations where the chance of detection of a difference becomes large, say greater than 0.5 at any rate. It is possible that a test based on the criterion given in equation (25), if it could be obtained, would be more powerful than either of the other two tests when $\rho_{12}$ is known. In practical cases, however, it will nearly always happen that $\rho_{12}$ is unknown, and in such cases the $r_{XY}$ of (18) appears the appropriate test criterion to use.
THE DISTRIBUTION OF RANGE IN SAMPLES FROM A
NORMAL POPULATION, EXPRESSED IN TERMS OF AN

INDEPENDENT ESTIMATE OF STANDARD DEVIATION

By D. NEWMAN

Department of Statistics, University College, London


1. Introduction


Starting from the contribution of Tippett (1925), a considerable amount of computational work has been carried out in recent years with the object of making possible the use of range, i.e. the distance between the highest and lowest observation, when dealing with samples from a normal population. Thus Tippett's tables of the mean range expressed in terms of the population standard deviation, $\sigma$, for sample sizes n = 2 to 1000, have been republished in Tables for Statisticians and Biometricians, Part II, Table XXII (K. Pearson, 1931). E. S. Pearson (1932) gave a table containing the standard deviation of range and the approximate upper and lower 10, 5, 1 and 0.5 % probability levels for sample sizes n = 2 to 100, again in terms of $\sigma$ as unit. In doing this he used appropriate Pearson-type curves with correct moments, and checked his results against experimental sampling distributions.

If a number of small samples are available, it has been shown that a good estimate of $\sigma$ may be obtained from the mean value of the range, which is only slightly less accurate than the estimate obtained from the sums of squares. Again, owing to the high correlation between range and standard deviation in a sample of size 10 or less, it was pointed out by Pearson & Haines (1935) that range may be usefully substituted for standard deviation in control charts for detecting changes in the variation of quality in industry. In all these cases, however, the basic sampling distribution used has been that of the range in terms of $\sigma$.

Not very long ago "Student" (the late Mr W. S. Gosset) suggested to Prof. E. S. Pearson that it might be useful to know more about the sampling distribution of the ratio

$$q = w/s,$$

where w is the range in a sample of n observations from a normal population with standard deviation $\sigma$, and $s^2$ is an independent and unbiased estimate of $\sigma^2$ based on f degrees of freedom, obtained from a sum of squares in the usual manner. The type of problem which "Student" had in mind was one in which, as a result of an experiment, a number of 'treatment' means, say $\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_n$, are available, and also an independent estimate, $s^2$, of their sampling variance. Then a rapid method of judging whether any treatment differences exist would consist in comparing the difference between the highest and lowest treatment means, say $w = \bar{x}_n - \bar{x}_1$, with s. Should this difference be clearly significant, having regard to the values of n and f, the more divergent of the extremes, say $\bar{x}_n$, could be set aside, and the difference $\bar{x}_{n-1} - \bar{x}_1$ compared with s, using n - 1 and f. This procedure would in fact be similar to that suggested by "Student" himself in his paper on "Errors in routine analysis" (1927, pp. 161-2), except that the ratio used would now be w/s rather than $w/\sigma$. Of course the probability levels for the former will tend to those of the latter ratio as $f \to \infty$.

In the following sections I shall first make use of the results obtained by Prof. Pearson in computing probability levels for $w/\sigma$, to determine appropriate levels for q = w/s, and then illustrate "Student's" suggestion on three practical examples. It should be noted that in a recent paper H. O. Hartley (1938) has suggested a systematic method of obtaining probability levels for "studentized" functions. It is hoped that before long fuller and more accurate tables may be available to supplement the present tables which rest to some extent on an empirical basis.


2. The expectation of q = w/s

Before describing the method of quadrature by which the probability levels were obtained, it may be useful to give a table from which the expectation of q for various values of n and f can be calculated. Since $s^2$ is an estimate of $\sigma^2$ based on f degrees of freedom, we have for the probability distribution of s,

$$p(s) = \frac{2\,(\tfrac{1}{2}f)^{f/2}}{\Gamma(\tfrac{1}{2}f)\,\sigma^{f}}\; s^{f-1} e^{-f s^2/2\sigma^2}. \qquad (1)$$

If we write p(w) for the probability distribution of range, a function whose precise value is only known for the cases n = 2, 3, and denote an expectation by the symbol E, we have

$$E(q) = \int_0^\infty p(q)\,q\,dq = \int_0^\infty\!\!\int_0^\infty w\,s^{-1} p(w)\,p(s)\,dw\,ds, \text{ since } w \text{ and } s \text{ are independent},$$

$$= \int_0^\infty w\,p(w)\,dw \times \int_0^\infty s^{-1} p(s)\,ds = E(w) \int_0^\infty s^{-1} p(s)\,ds.$$

Since the values of $E(w/\sigma)$ for changing n have been tabled by Tippett, it is only necessary to consider the integral $\int_0^\infty s^{-1} p(s)\,ds$.
Hence it follows that

$$E(q) = E(w/\sigma) \times \sqrt{\tfrac{1}{2}f}\;\frac{\Gamma\{\tfrac{1}{2}(f-1)\}}{\Gamma(\tfrac{1}{2}f)}. \qquad (2)$$

A brief table of the second function is given below; the values of $E(w/\sigma)$ may be obtained from Tippett (1925), pp. 386-7, or from Tables for Statisticians and Biometricians, Part II, Table XXII.

TABLE I

Factor by which to multiply $E(w/\sigma)$ to obtain E(q)

Degrees of freedom    3       5       7      ...

Factor              1.382   1.189   1.126    ...
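
The factor in (2) is easily evaluated numerically. The sketch below assumes only the formula as written; the function name is hypothetical, and the log-gamma function is used so that large f causes no overflow.

```python
import math


def expectation_factor(f):
    """Factor sqrt(f/2) * Gamma((f-1)/2) / Gamma(f/2) multiplying E(w/sigma)."""
    if f <= 1:
        raise ValueError("the expectation of 1/s requires f > 1")
    log_ratio = math.lgamma((f - 1) / 2.0) - math.lgamma(f / 2.0)
    return math.sqrt(f / 2.0) * math.exp(log_ratio)
```

For f = 3 and f = 5 this agrees with the tabled factors to three decimal places, and the factor tends to unity as f increases, in keeping with the remark that the levels for w/s tend to those for $w/\sigma$.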


3. Computation of table of 5 and 1 % significance levels for q = w/s

The problem is to determine, for different values of n and f, values of $q_\alpha$ such that

$$\alpha = \int_0^\infty p(w)\left\{\int_0^{w/q_\alpha} p(s)\,ds\right\} dw, \qquad (3)$$

where $\alpha$ = 0.05 and 0.01. Since the distribution of q will clearly be independent of the population standard deviation $\sigma$, we may take $\sigma$ as unity in the probability functions used in (3). Except in the cases n = 2, 3, which will be referred to again below, the procedure adopted was as follows:

(a) The ordinates of the empirical curves, say $p_e(w)$, obtained by E. S. Pearson (1932) for the cases n = 4, 6, 10, and 20 were used in place of the unknown p(w). These ordinates had been calculated at equal intervals of 0.1 for w, the population standard deviation being the unit.

(b) Taking a trial value of $q_\alpha$, the integrals $J(w, q_\alpha) = \int_0^{w/q_\alpha} p(s)\,ds$ were calculated, with the help of the Tables of the Incomplete Gamma Function (K. Pearson, 1922), for each value of w used in (a). Thus $J(w, q_\alpha) = I(u, p)$ in the notation of the tables, where $p = \tfrac{1}{2}f - 1$ and $u = f w^2 / (2 q_\alpha^2 \sqrt{p+1})$.

(c) It was then necessary to apply quadrature to the products $p_e(w) \times J(w, q_\alpha)$, calculated at intervals 0.8 for w, through as much of the range w = 0 to w = $\infty$ as was necessary to obtain the required degree of accuracy.

(d) The resulting expression would of course not correspond to $\alpha$ exactly, as $q_\alpha$ had been guessed. Other trial values were taken for $q_\alpha$ and the final value corresponding to $\alpha$ = 0.05 or 0.01 obtained by backward interpolation. Starting with the case n = 2, where the exact value of $q_\alpha$ could be obtained for all f, and knowing for n > 2 the limits to which $q_\alpha$ tended as $f \to \infty$, this process of trial and error was not found too laborious.
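
Steps (a) to (d) can be imitated on a computer. The sketch below is an illustration only and departs from the method described in three labelled ways: it uses the exact range density for n = 2 (with $\sigma$ = 1) in place of the empirical ordinates, a power series in place of the Tables of the Incomplete Gamma Function, and bisection in place of backward interpolation. All function names are hypothetical.

```python
import math


def reg_lower_gamma(a, x):
    """Regularised lower incomplete gamma function P(a, x), by power series."""
    if x <= 0.0:
        return 0.0
    if x > a + 12.0 * math.sqrt(a) + 40.0:
        return 1.0  # so far into the upper tail that the deficit is negligible
    term = 1.0 / a
    total = term
    denom = a
    for _ in range(5000):
        denom += 1.0
        term *= x / denom
        total += term
        if term < total * 1e-16:
            break
    return total * math.exp(a * math.log(x) - x - math.lgamma(a))


def prob_s_below(c, f):
    """P(s < c) when f * s^2 is distributed as chi-square with f d.f. (sigma = 1)."""
    return reg_lower_gamma(f / 2.0, f * c * c / 2.0)


def range_density_n2(w):
    """Exact density of the range in samples of two, sigma = 1."""
    return math.exp(-w * w / 4.0) / math.sqrt(math.pi)


def tail_probability(q, f, density, w_max=12.0, steps=600):
    """Integral of p(w) * P(s < w/q) over w, by Simpson's rule (steps must be even)."""
    h = w_max / steps
    total = 0.0
    for i in range(steps + 1):
        w = i * h
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * density(w) * prob_s_below(w / q, f)
    return total * h / 3.0


def solve_q(alpha, f, density, lo=0.5, hi=60.0):
    """Find q_alpha by bisection; the tail probability decreases as q increases."""
    for _ in range(40):
        mid = 0.5 * (lo + hi)
        if tail_probability(mid, f, density) > alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For n > 2 the same routine could be run with any other function supplied as `density`, for instance an interpolation of tabulated ordinates; the n = 2 case is used here because its exact answer is known and so provides a check.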
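Special case when n = 2

In this case, taking $\sigma$ = 1, the distribution of w assumes the simple form

$$p(w) = \frac{1}{\sqrt{\pi}}\, e^{-\frac{1}{4}w^2} \quad \text{for } 0 \le w < \infty.$$

Hence the joint probability distribution of w and s is

$$p(w, s) = \text{constant} \times s^{f-1} e^{-\frac{1}{4}w^2 - \frac{1}{2}f s^2}. \qquad (5)$$

Transforming to variables q = w/s and s, since

$$\left|\frac{\partial(w, s)}{\partial(q, s)}\right| = s, \qquad (6)$$

we obtain

$$p(q, s) = \text{constant} \times s^{f} e^{-\frac{1}{2}s^2 (f + \frac{1}{2}q^2)}. \qquad (7)$$

Now integrating for s between the limits 0 and $\infty$, we obtain for the probability distribution of q

$$p(q) = \frac{2\,\Gamma\{\tfrac{1}{2}(f+1)\}}{\sqrt{2\pi f}\;\Gamma(\tfrac{1}{2}f)} \left(1 + \frac{q^2}{2f}\right)^{-\frac{1}{2}(f+1)} \quad \text{for } q \ge 0. \qquad (9)$$

This corresponds to the positive half of a "Student" distribution having f degrees of freedom. Values of $q_\alpha$ satisfying the relation $\alpha = \int_{q_\alpha}^{\infty} p(q)\,dq$ may therefore be obtained from R. A. Fisher's (1938) tables of the percentage points for t. Thus

$$q_\alpha = \sqrt{2}\; t_\alpha, \qquad (10)$$

where $t_\alpha$ will be respectively the 5 and 1 % levels for t.*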

Special case when n = 3

For this case McKay & Pearson (1933) have given an expression for the true distribution of w. The quadrature method employed when n > 3 was again used, but the true values of p(w) were taken, and not the ordinates of the empirical curve upon which E. S. Pearson (1932) based his original tables of percentage limits for w.

The following table shows the framework of values for $q_{.05}$ and $q_{.01}$ obtained as has been described. Values for f = $\infty$ were of course already known, and values for n = 2 and 3 are exact.

TABLE II

Framework values for $q_{.05}$ and $q_{.01}$ (entries shown as ... are illegible in the source)

                     5 % points                              1 % points
 f \ n     2     3     4     6     10    20        2     3     4     6     10    20
   5     3.64  4.60  5.22  6.03   ...   8.21      5.70   ...  7.83   ...   ...   ...
  10     3.15  3.88  4.34  4.92  5.60  6.47      4.48  5.27  5.77  6.42  7.21  8.22
  20     2.95  3.58  3.97  4.45  5.01  5.71      4.02  4.63  5.01   ...   ...  6.82
  30     2.89  3.49  3.80  4.31  4.82  5.47      3.89  4.46  4.78  5.23  5.75  6.40
  inf    2.77  3.31  3.60  4.04  4.48  5.01      3.64  4.12  4.38  4.74  5.15   ...

From Table II, the more complete working Tables III and IV were obtained by interpolation. It was found that the changes in percentage levels, both with increasing n and f, ran most smoothly if the arguments 60/n and 60/f were used in place of n and f. On this basis, interpolation from the framework values was effected, using five and six-point Lagrangian formulae. Various checks were carried out, as for example that of comparing the values obtained by this method with the known true values in the case of n = 2. Finally, the figures were reduced to one place of decimals, and these are given in Tables III and IV.

* These levels correspond to deviations at which the ordinates cut off 2.5 and 0.5 % from each end of the t-distribution, but they are termed by Fisher the 5 and 1 % levels.


4. Illustrative examples

In the following examples the range test is used as an alternative to the z test; the latter is, on theoretical grounds, the more efficient of the two in the sense that it is the more likely to detect the presence of real differences if they exist. Both "Student" and L. H. C. Tippett have, however, held that situations may be met, particularly when dealing with industrial problems, where the gain in speed following the use of range justifies the relatively small loss in efficiency. No doubt other types of examples besides those illustrated below will occur to the reader.

Example A

Fisher (1937, p. 93) has given the results of a 6 x 6 Latin square experiment in which six different fertilizer treatments were applied to a crop of potatoes. Denoting these treatments by the letters A, B, ..., F, the mean yields per plot in lb. were as follows (a digit shown as ? is illegible in the source):

   A       B       C       D       E       F
 345.0   426.5   477.8   405.2   520.2   601.?

The analysis of variance table is as follows:

Sources of variation    Degrees of freedom    Mean squares
Rows                            5              10,839.72
Columns                         5               4,803.46
Treatments                      5              48,836.08
Error                          20               1,527.06

TABLE IV

1 % points for q = w/s
To test the significance of treatment differences as a whole we find z = ..., while for degrees of freedom $f_1 = 5$, $f_2 = 20$, R. A. Fisher's tables give $z_{.01}$ = .... There are clearly, therefore, significant treatment differences present. The individual treatment means have been plotted in the figure, and the tabled significance levels for q = w/s may be used, with discretion, as a foot-rule in investigating the situation.

Table IV above gives for n = 6, f = 20 a 1 % level for q of 5.5, confirming the very significant scatter of treatment means brought out by the z-test.

We may now ask whether, if we were to exclude the most divergent treatment F, there is evidence of a significant difference among the remaining five treatments. We find that w = 520.2 - 345.0 = 175.2, and, using the same estimate of standard error, s = 15.95, find q = 11.0. This value is still well beyond $q_{.01}$ = 5.3, the level obtained from Table IV with n = 5, f = 20.

Making successive trials we find that:

(i) Excluding A and F, the range of the four treatments B, C, D, and E is still significant, since q = 115.0/15.95 = 7.2, while for n = 4, f = 20 we have $q_{.01}$ = 5.0.

(ii) Excluding A, E, and F, the range of the three treatments B, C, and D is significant at the 5 % but just not significant at the 1 % level. For in this case q = 72.6/15.95 = 4.5, while for n = 3, f = 20 we find $q_{.05}$ = 3.6 and $q_{.01}$ = 4.6.

(iii) On the other hand, if we divided the six treatments into two groups, one consisting of A, D, B, and the other of C, E, F, we find that the value of q in both groups falls beyond $q_{.01}$ = 4.6.

We are therefore led to conclude that the high value of z obtained from the comprehensive test cannot be explained by one, or even two, treatments differing from the others. It is doubtful, even, whether any three treatments out of the six could be regarded as forming a homogeneous group.

Two final points should be noted. In the first place, after omitting successive treatments regarded as divergent, analysis of variance procedure could be applied to test for significant differences among the remaining treatments. The calculation would, however, not be as quick as that involved in the successive trials (i), (ii), and (iii) above, using q. Finally, as mentioned above, the method followed, whether z or q is used, must be employed with discretion, as is always the case when observations are rejected and a hypothesis tested using the selected data that remain.
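
The successive trials of Example A amount to recomputing q = w/s on shrinking subsets of the treatment means. The sketch below uses the five means A to E and s = 15.95 as read from the text; the value taken for B (426.5) is an assumption, since that entry is poorly legible in the source, and F is omitted because its last digit cannot be read. The function names are hypothetical.

```python
def studentized_range(means, s):
    """Ratio q = w/s of the range of the given means to the standard error s."""
    return (max(means) - min(means)) / s


def successive_trials(means_by_label, s, subsets):
    """Compute q for each named subset of treatments."""
    return {
        labels: studentized_range([means_by_label[c] for c in labels], s)
        for labels in subsets
    }


# Means as read from the text; B is an assumed reading, F is left out.
example_a = {"A": 345.0, "B": 426.5, "C": 477.8, "D": 405.2, "E": 520.2}
q_values = successive_trials(example_a, 15.95, ["ABCDE", "BCDE", "BCD"])
```

Each value in `q_values` would then be compared with the tabled level for the corresponding n (5, 4 and 3) and f = 20.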

Example B

A similar example has been taken from Snedecor (1937, p. 214). The following are seven treatment means expressed in terms of bushels per acre, obtained from a 7 x 7 Latin square experiment, again comparing the effect of different fertilizers on potatoes (a digit shown as ? is illegible in the source).

    A        B        C        D        E        F        G

 34?.8?   363.14   360.57   360.48   379.80   388.29   387.14

The appropriate estimate of the standard deviation of a mean of seven plots calculated from the error sum of squares of the analysis of variance table is s = 9.52. Testing the significance of treatment differences as a whole, it is found that z = 0.5?74, while for degrees of freedom $f_1 = 6$, $f_2 = 30$, $z_{.05}$ = 0.4420 and $z_{.01}$ = 0.6226. The ratio q = w/s for all seven treatments is 4?.3/9.52 = 4.7, a value lying between $q_{.05}$ = 4.5 and $q_{.01}$ = 5.4 (entering Tables III and IV with n = 7, f = 30). Thus using either test we should conclude that there were probably significant treatment differences.

The seven treatment means have been plotted in the lower half of the figure. There is a suggestion that if either (i) treatments F and G or (ii) treatment A were regarded as exceptional, the remaining treatments would form a homogeneous group.

This is confirmed on investigation:

(i) Removing F and G, we find q = 3?.?/9.52 = 3.?, which is just below the 5 % level ($q_{.05}$ = 4.1) for n = 5, f = 30.

(ii) Removing A, we find q = 27.8/9.52 = 2.9, which is well below the 5 % level ($q_{.05}$ = 4.3) for n = 6, f = 30. The more homogeneous group appears to be left in the second case, i.e. on removing A alone. Beyond this indication we cannot go, as it would need a knowledge of the character of the treatments to draw more definite conclusions.

Example C

In cases where a number of duplicate observations are available, an estimate of variability may be obtained rapidly from summing the squares of the differences between pairs. Thus

$$s^2 = \frac{1}{2k}\sum_{i=1}^{k} d_i^2,$$

and has k degrees of freedom. We may now compare the range in the means of pairs with s, in order to determine whether there is too much variation between pairs having regard to the variation within pairs. It may be noted that an estimate of $\sigma$ may be obtained even more rapidly by calculating the mean range in pairs and multiplying by 0.8862,* so that

$$s' = 0.8862 \times \frac{1}{k}\sum_{i=1}^{k} |d_i|,$$

but since the sampling distribution of $(s')^2$ is not that of $s^2$, we cannot justifiably take q = w/s' and refer to the tables of percentage limits I have given.
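
The two rapid estimates of $\sigma$ from duplicate determinations may be sketched as follows. This assumes, as a reading of the damaged formulae, that the sum of squared differences is divided by 2k and that the mean absolute difference is multiplied by 0.8862 (that is, $\sqrt{\pi}/2$, the reciprocal of 1.12838). The function names are hypothetical and any data passed in are the reader's own.

```python
import math

# Reciprocal of the expected range in a sample of two (1.12838 sigma).
RANGE_TO_SIGMA_N2 = math.sqrt(math.pi) / 2.0


def sigma_from_squares(pairs):
    """Estimate of sigma from the sum of squared differences; k degrees of freedom."""
    k = len(pairs)
    return math.sqrt(sum((a - b) ** 2 for a, b in pairs) / (2.0 * k))


def sigma_from_mean_range(pairs):
    """Quicker estimate of sigma: mean range in pairs multiplied by 0.8862."""
    k = len(pairs)
    return RANGE_TO_SIGMA_N2 * sum(abs(a - b) for a, b in pairs) / k
```

Only the first estimate has the sampling distribution assumed in the tables of q, which is why the second is offered in the text as a rough check rather than as a divisor for the range.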


Determinations of percentage fibre


Analyst 


■ ' ■■■■ 





-* 


A 

D 

V 

D 


F 

15 

1} 

1 1 

ist determination (kh) 

12’llft 

12'ni 

12'32 

ta-ifi 

12-73 

12-4S 

1130 * 

12t4 


2nd determination (a:,,) 

12'47 

12'fi2 


l2'St3 

12-43 

12-40 

1273 

12-03 

1 

i2'«i 

Difference {d^) 

oao 

-O'll 

-n'23 

0-22 

0-30 

(1-112 

-.(H3 

O-ll 

.. -*)i 

nnt 1 

Sum 

25'13 

25'13 

24'87 

26'08 

26-18 

1 

1 

25-(j3 

24-I7 


* The reciprocal of 1.12838, since the expectation of range in a sample of two is $1.12838\sigma$.
The figures in the table above consist of ten duplicate determinations of the percentage fibre in carefully mixed samples taken from the supply of Soya ..., each pair of values being obtained by one analyst.* Ten different analysts were concerned. The problem is to determine whether these few observations provide any evidence of systematic differences in technique between the analysts.
The full analysis of variance is as follows:

                   Sum of squares    D.F.    Mean square
Within pairs          0.41145         10      0.041145
Between pairs         1.253045         9      0.139227
Total                 1.664495        19

Testing for differences between analysts, we obtain z = 0.6095, a value falling between the 1 % (0.79??) and 5 % (0.5527) points.

Using the range method, we note that the estimate of the variance of the sum of two determinations is

$$\frac{1}{k}\sum_{i=1}^{k} d_i^2,$$

giving an estimate of the standard error of a sum of 0.2868.† The range in the ten sums in the table is w = 26.08 - 24.17 = 1.91, so that q = 1.91/0.2868 = 6.66. For n = 10, f = 10, Tables III and IV show $q_{.05}$ = 5.6, $q_{.01}$ = 7.2. Thus the difference is significant at the 5 % level, a result similar to that found using the z-test.

If now we omit determinations of analyst D (who gave the highest readings) we find q = (25.98 - 24.17)/0.2868 = 6.31, and this is still significant at the 5 % level, since for n = 9, f = 10, $q_{.05}$ = 5.5. If, however, we omit the determinations of analyst H (who gave the lowest readings) we find q = (26.08 - 24.87)/0.2868 = 4.22, and this is no longer significant. We should conclude, therefore, that except possibly in the case of analyst H, there is no evidence on the data available of systematic differences in technique.

I should like to take this opportunity of thanking Prof. E. S. Pearson not only for suggesting the problem to me, but also for his kind help in putting this paper together.

* The figures have been taken from more extensive data made available through the kindness of Dr F. Toflher.

† The corresponding estimate obtained, as described on p. 28 above, from the mean difference between pairs is 0.291.
ON A COMPREHENSIVE TEST FOR THE HOMOGENEITY
OF VARIANCES AND COVARIANCES
IN MULTIVARIATE PROBLEMS

By D. J. BISHOP

Department of Statistics, University College, London
1. Introduction

Now that satisfactory and probably final solutions have been obtained for a wide variety of statistical problems concerned with a single normally distributed variable, more and more attention has recently been given to the solution of multivariate problems. The multiple correlation methods of the old large sample theory have been replaced in many instances by others for which "studentized" test criteria are available, often having sampling distributions that are already familiar in univariate problems. In a recent paper on "The statistical utilization of multiple measurements", R. A. Fisher (1938a) has shown the connexion between certain of these methods: the $D^2$-statistic work of Mahalanobis, the discriminant function methods of the Galton Laboratory and the generalized "Student's" ratio of Hotelling. A similar very general problem was dealt with some time ago by S. S. Wilks (1932), while mention may also be made of two papers by D. N. Lawley (1938a, b) and a paper by P. L. Hsu (1938). The purpose of the methods put forward is to obtain information regarding the mean values of a number, say q, of correlated variables in one or more, say k, populations from which random samples have been drawn. If we denote by $x_u$ a value of the uth variable (u = 1, 2, ..., q), then in all this work it has been assumed not only that $x_u$ is normally distributed, but that it has the same variance $\sigma_u^2$ in every population sampled. Further, it is assumed that if $x_v$ is a second variable the correlation coefficient $\rho_{uv}$ between $x_u$ and $x_v$ is the same in all populations. The estimates of variance and covariance required in order to "studentize" the function of the sample means are therefore obtained by pooling together the sums of squares and sums of products from all samples. While it is true that even if $\sigma_u$ and $\rho_{uv}$ are not the same in all populations the error involved may not be very large, it is however important to have available some means of testing the basic hypothesis which assumes homogeneity throughout the populations.

Such a test has been derived by S. S. Wilks (1932) by an extension of Neyman & Pearson's likelihood ratio method of approach. Hitherto the somewhat lengthy computations required to obtain the moments of the sampling distribution of the test criterion have probably discouraged its use. The objects of the present paper are as follows:

(a) In the simple but commonly met case, where the k samples are of the same size, to put Wilks' test into such a form that once the lengthy process of computing the kq variances and $\tfrac{1}{2}kq(q-1)$ covariances has been carried out, relatively little further labour is required to obtain a test criterion which may be referred with practical accuracy either to Fisher's z-tables or to the Tables of the Incomplete Beta Function (K. Pearson, 1934).

(b) In the case where the sample sizes differ, to suggest an alternative procedure which is accurate when dealing with large samples.

It is, of course, always open to question how far a single comprehensive criterion is satisfactory in a complex problem. Certain points should, however, be remembered. In the problem referred to above, dealing with the means only, the usefulness of a single criterion has been widely recognized. If, when applied to adequately large samples, a suitably chosen comprehensive criterion shows no significant evidence of lack of homogeneity among the variances and covariances, we are saved the lengthy process of making many individual comparisons. If, however, the criterion falls beyond the significance level, it will be necessary, as when dealing with the means, to make a more detailed analysis in order to locate the source of disturbance.

Finally, it may be noted that in the case of a single variate (q = 1) the problem is much simplified. A full discussion of two tests available in this case, the Neyman & Pearson $L_1$ test and the Bartlett $\mu$ test, has recently been published elsewhere (Bishop & Nair, 1939).


2. Wilks' generalized likelihood criterion


It is proposed first to define this criterion in the simple case when the samples from each of the k populations are of the same size, n; to quote the sampling moments derived by Wilks; to give the form of a working test developed in the later pages; and to illustrate its use on an example.

It will be convenient to use the following notation. We shall consider k samples, each of size n, drawn from populations of q variates. That is to say, the total number of individuals sampled is kn = N and for each individual q characters are measured, so that the total number of observations is Nq. Let $x_{sti}$ be the observed value of the character s of the ith individual in the tth sample.

Then s = 1, 2, ..., q; t = 1, 2, ..., k; and i = 1, 2, ..., n.


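Let

$$\bar{x}_{st} = \frac{1}{n}\sum_{i=1}^{n} x_{sti}$$

be the mean value observed for the character s in the tth sample. Now put

$$w_{sut} = \frac{1}{n}\sum_{i=1}^{n} (x_{sti} - \bar{x}_{st})(x_{uti} - \bar{x}_{ut}),^{*} \qquad (1)$$

where s, u = 1, 2, ..., q. Then if s = u, $w_{sst}$ is just the variance of observations of character s = u within the tth sample, whereas if $s \neq u$, $w_{sut}$ is the covariance of the characters s and u.

The generalized variance in the tth sample is defined by the symmetrical determinant

$$|w_{sut}| = \begin{vmatrix} w_{11t} & w_{12t} & \cdots & w_{1qt} \\ w_{21t} & w_{22t} & \cdots & w_{2qt} \\ \vdots & \vdots & & \vdots \\ w_{q1t} & w_{q2t} & \cdots & w_{qqt} \end{vmatrix}. \qquad (2)$$

Then if

$$w_{su} = \frac{1}{k}\sum_{t=1}^{k} w_{sut}, \qquad (3)$$

the likelihood criterion appropriate for testing the comprehensive hypothesis, say $H_0$, that all the corresponding variances and covariances in the k sampled populations are equal is

$$\lambda = \prod_{t=1}^{k} \left\{\frac{|w_{sut}|}{|w_{su}|}\right\}^{n/2}. \qquad (4)$$

* The use of the divisor n instead of the more usual n - 1 merely allows for a multiplying constant in the expression for $\lambda$.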

DfiaTninlnc 




power of Ai, rather than that of itself, owing to the extreme skewness of the 
latter distribution. Within limits there is some cdioiee in the power of A^ which 
may he selected to give a convenient sampling distribution. Throughout this 
paiior we shall follow Pearson & Wilks (1933) by using the l/iVth power, so that 
the criterion for use in practice will be 




n 


(5*) 
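The definitions (2), (3) and (5) can be sketched in a few lines of code. The block below is a minimal illustration only, assuming the $k$ sample covariance matrices (divisor $n$, equal sample sizes) have already been computed; the function names `determinant` and `wilks_l1` are hypothetical and not part of the paper.

```python
def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [list(map(float, row)) for row in matrix]
    size = len(a)
    det = 1.0
    for col in range(size):
        # Choose the row with the largest entry in this column as pivot.
        pivot = max(range(col, size), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for row in range(col + 1, size):
            factor = a[row][col] / a[col][col]
            for j in range(col, size):
                a[row][j] -= factor * a[col][j]
    return det


def wilks_l1(sample_covariances):
    """Criterion l_1 of equation (5) for k samples of equal size.

    sample_covariances: list of k square (q-by-q) matrices v_t.
    """
    k = len(sample_covariances)
    q = len(sample_covariances[0])
    # Equation (3): element-wise average of the k matrices.
    pooled = [[sum(v[s][u] for v in sample_covariances) / k for u in range(q)]
              for s in range(q)]
    pooled_det = determinant(pooled)
    l1 = 1.0
    for v in sample_covariances:
        # Each sample contributes {|v_t| / |v_0|}^(1/(2k)).
        l1 *= (determinant(v) / pooled_det) ** (1.0 / (2 * k))
    return l1
```

With identical matrices in every sample the function returns unity, the upper limit of $l_1$; the more the generalized variances differ, the smaller the value.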


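If $H_0$, the hypothesis tested, be true, the $h$th moment about zero of the sampling
distribution of $l_1$ has been given by Wilks (1932, p. 490) in the form

$$\mu'_h = k^{qh/2}\prod_{s=1}^{q}\frac{\{\Gamma(\tfrac12(n+h/k-s))\}^{k}\,\Gamma(\tfrac12(k(n-1)+1-s))}{\{\Gamma(\tfrac12(n-s))\}^{k}\,\Gamma(\tfrac12(k(n-1)+1+h-s))}. \qquad (6)$$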


As a result of trial in a number of cases it seems probable that the Pearson Type I
distribution in the form

$$p(l_1) = \frac{\Gamma(m_1+m_2)}{\Gamma(m_1)\,\Gamma(m_2)}\,l_1^{m_1-1}(1-l_1)^{m_2-1} \qquad (7)$$

will give a good approximation to the distribution of $l_1$, if the parameters $m_1$ and
$m_2$ are so chosen that the first two moments of (7) agree with the true values as
given by (6). If this is done it is found that

$$m_1 = \frac{\mu'_1(\mu'_1-\mu'_2)}{\mu_2} \qquad (8)$$

and

$$m_2 = \frac{(1-\mu'_1)(\mu'_1-\mu'_2)}{\mu_2}, \qquad (9)$$

where

$$\mu_2 = \mu'_2 - \mu'^{\,2}_1, \qquad (10)$$

$\mu'_1$ and $\mu'_2$ being respectively the first and second moments about zero of $l_1$. From
the relation (6) it is seen that their values are

$$\mu'_1 = k^{q/2}\prod_{s=1}^{q}\frac{\{\Gamma(\tfrac12(n+1/k-s))\}^{k}\,\Gamma(\tfrac12(k(n-1)+1-s))}{\{\Gamma(\tfrac12(n-s))\}^{k}\,\Gamma(\tfrac12(k(n-1)+2-s))} \qquad (11)$$

and

$$\mu'_2 = k^{q}\prod_{s=1}^{q}\frac{\{\Gamma(\tfrac12(n+2/k-s))\}^{k}\,\Gamma(\tfrac12(k(n-1)+1-s))}{\{\Gamma(\tfrac12(n-s))\}^{k}\,\Gamma(\tfrac12(k(n-1)+3-s))}. \qquad (12)$$

* Note that $\{l_1(q = 1)\}^2 = L_1$, in the original notation of Neyman & Pearson (1931).
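As a rough illustration of how the moments (11) and (12) lead to the Type I parameters of (8) and (9), the sketch below evaluates the moment formula (6) through logarithms of the Gamma function, much as the paper does with tables. It is an assumption-laden sketch rather than the author's procedure: the function names are hypothetical, the formula is taken as reconstructed above, and $n > q$ is required so that every Gamma argument is positive.

```python
from math import exp, lgamma, log


def l1_moment(h, k, n, q):
    """h-th moment about zero of l_1 under H_0, following equation (6)."""
    log_moment = 0.5 * q * h * log(k)
    for s in range(1, q + 1):
        log_moment += k * (lgamma(0.5 * (n + h / k - s)) - lgamma(0.5 * (n - s)))
        log_moment += lgamma(0.5 * (k * (n - 1) + 1 - s))
        log_moment -= lgamma(0.5 * (k * (n - 1) + 1 + h - s))
    return exp(log_moment)


def type1_parameters(k, n, q):
    """Parameters m_1 and m_2 of the Type I curve (7), from (8) and (9)."""
    mu1 = l1_moment(1, k, n, q)
    mu2_about_zero = l1_moment(2, k, n, q)
    variance = mu2_about_zero - mu1 ** 2          # equation (10)
    m1 = mu1 * (mu1 - mu2_about_zero) / variance  # equation (8)
    m2 = (1.0 - mu1) * (mu1 - mu2_about_zero) / variance  # equation (9)
    return m1, m2
```

For $k = 2$, $n = 30$, $q = 1$ this gives values close to the $m_1 = 57.282$ and $m_2 = 0.500$ entered in Table VI. Because the variance of $l_1$ is small, the subtraction in (10) loses several significant figures, which is why many figures are needed in the moments.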

The hypothesis $H_0$ will be rejected when $l_1$ is exceptionally low. The probability
level may be obtained from the Tables of the Incomplete Beta Function.
Alternatively 5% and 1% levels for $l_1$ can be obtained from R. A. Fisher's $z$
tables (1938b, Table VI), by writing

$$l_1 = \frac{m_1}{m_1 + m_2 e^{2z}}, \qquad (13)$$

where $z$ has degrees of freedom

$$f_1 = 2m_2, \qquad f_2 = 2m_1.^{*} \qquad (14)$$


It is no doubt the labour involved in calculating the moments $\mu'_1$ and $\mu'_2$ that
has discouraged the use of Wilks' test. As will be shown in the later sections of
the paper, if $n$ is not too small,† the following empirical relations may be used
with sufficient accuracy for practical purposes to express the $m_1$ and $m_2$ of (14)
directly in terms of (a) the number of variables $q$, (b) the sample size $n$ and (c) the
number of samples $k$:

$$m_1 = k(n-q) - 0.01(k-1)(90 - 39q + 9q^2), \qquad (15)$$

$$m_2 = 0.25(k-1)q(q+1). \qquad (16)$$

These formulae are found to give satisfactory results for all values of $k$, over
the range considered for $q$, namely 1 to 6.

In the following sections of the paper we shall first give an example illustrating
the practical use of the test and afterwards justify the approximations
which have been made.

* Since in general $f_1$ and $f_2$ in (14) will not be integral, it is necessary to interpolate in the
tables for fractional degrees of freedom.

† Certainly if $n \ge 20$, and probably if $n \ge 10$ provided $q < 5$.
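The empirical relations (15) and (16) and the degrees of freedom (14) are simple enough to evaluate directly. The following is a minimal sketch, with hypothetical function names, that takes the coefficients exactly as printed in (15) and (16).

```python
def empirical_m1_m2(k, n, q):
    """Empirical relations (15) and (16) for the Type I parameters."""
    m1 = k * (n - q) - 0.01 * (k - 1) * (90 - 39 * q + 9 * q * q)
    m2 = 0.25 * (k - 1) * q * (q + 1)
    return m1, m2


def z_degrees_of_freedom(m1, m2):
    """Degrees of freedom (f_1, f_2) of Fisher's z, equation (14)."""
    return 2.0 * m2, 2.0 * m1
```

For $k = 5$, $n = 30$ and $q = 4$ the function gives $m_1 = 126.88$ and $m_2 = 20$, so that $z$ is entered with $f_1 = 40$ and the fractional value $f_2 = 253.76$, which is where the interpolation mentioned in the footnote comes in.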
3. Illustrative example

The data used were taken from the appendix of a craniological study by E.
Pittard (1909) and consist of measurements of skulls of males found in different
localities in the Rhone valley. The characters considered are:


(1) Length of skull in mm., denoted by L. 

(2) Breadth of skull in mm., denoted by B. 

(3) Height of skull in mm., denoted by H. 

(4) Breadth of face in mm., denoted by F.


The skulls are divided into five groups of thirty skulls each, according to the
neighbourhood in which they were found. The five groups are:


(1) Skulls found in the village of Biel.

(2) Skulls found in the village of Naters.

(3) Skulls found in the village of Viege.

(4) Skulls found in the village of Rarogne.

(5) Skulls found in the village of Sierre.

It is desired to investigate if the population variances and covariances of the
measurements of the four characters may be taken as the same for each of the
five populations sampled. This is equivalent to asking whether the population
variances and covariances differ from village to village.

With the notation previously employed it is clear that in this case $n = 30$,
$k = 5$ and $q = 4$. The sample variances and covariances, as given by (1), were
calculated in the usual manner. Their values, together with the sample correlation
coefficients, are shown in Table Ia. Using these results, the values of the
generalized sample variances $|v_{tsu}|$, which are also given in Table Ia, were
calculated from equation (2). The substitution of these values of $|v_{tsu}|$ into
equation (5) gives $l_1 = 0.725$.

The probability levels of the sampling distribution of $l_1$ are given by (13)
and (14), where

$$m_1 = 126.880 \quad\text{and}\quad m_2 = 20.000$$

are determined by substituting $k = 5$, $n = 30$ and $q = 4$ in the empirical relations
(15) and (16). The levels are found to be:

5% level of $z$ = 0.1830 and 1% level of $z$ = 0.2573,
hence 5% level of $l_1$ = 0.815 and 1% level of $l_1$ = 0.791.


The calculated value of $l_1$ being considerably below the 1% level, the hypothesis
that the variances and covariances are the same for each of the five populations
sampled must be rejected.
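The arithmetic of this step can be checked with a short sketch. It takes the values of $m_1$, $m_2$, the two $z$ levels and the observed $l_1$ exactly as quoted in the text, and assumes the transformation $l_1 = m_1/(m_1 + m_2 e^{2z})$ of equation (13); the function and variable names are illustrative only.

```python
from math import exp


def l1_level_from_z(z_level, m1, m2):
    """Equation (13): level of l_1 corresponding to a level of Fisher's z."""
    return m1 / (m1 + m2 * exp(2.0 * z_level))


# Values quoted in the text for the skull data (k = 5, n = 30, q = 4).
M1, M2 = 126.880, 20.000
Z_5_PERCENT, Z_1_PERCENT = 0.1830, 0.2573
OBSERVED_L1 = 0.725

level_5 = l1_level_from_z(Z_5_PERCENT, M1, M2)
level_1 = l1_level_from_z(Z_1_PERCENT, M1, M2)

# H_0 is rejected when the observed l_1 falls below the chosen level.
reject_at_1_percent = OBSERVED_L1 < level_1
```

The two computed levels agree with the quoted 0.815 and 0.791 to three decimal places, and the observed value 0.725 lies below both.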

Further tests may now be made in order to find where the lack of uniformity
occurs. To decide whether the variances differ we may apply four single variate
$L_1$ tests, treating the observations of each character separately. When this is
done the results following in Table Ib are obtained.


TABLE Ib

Character      $L_1$
L              0.985
B              0.967
H              0.964
F              0.974


The tables of P. P. N. Nayer (1936) give the 5% level for $L_1$, when $n = 30$ and
$k = 5$, as 0.936, so that there is no evidence to suggest that the variances of
measurements of any one character differ significantly from sample to sample.

The lack of uniformity must therefore occur in the correlations, so that it is
necessary to examine the variation in each of the six sets of five correlation
coefficients. If $r$ be a sample correlation coefficient calculated from $n$ pairs of
observations it is known that

$$z' = \tfrac12\{\log_e(1+r) - \log_e(1-r)\}$$

is approximately normally distributed with standard deviation $1/\sqrt{n-3}$, the
mean of $z'$ being a function of the population correlation coefficient $\rho$ and the
sample size $n$. Consequently, we may test whether correlation coefficients
$r_t$ ($t = 1, 2, \ldots, k$), each based on $n$ pairs of observations and obtained from $k$
independent samples, differ only through chance fluctuations from some
common population value by calculating

$$\chi^2 = (n-3)\sum_{t=1}^{k}(z'_t - \bar{z}')^2,$$

where

$$\bar{z}' = \frac{1}{k}\sum_{t=1}^{k} z'_t,$$

and entering the tables of the $\chi^2$ integral with $k - 1$ degrees of freedom. A
significantly large value of $\chi^2$ will indicate that the $k$ samples cannot be considered
as having been drawn from populations with a common value of $\rho$. This
procedure was carried out with the results shown in Table Ic.


TABLE Ic

Correlation between
characters            $\chi^2$     Remarks
L and B               14.31        Significant
L and H                7.82
L and F               20.47        Significant
B and H               11.87        Probably significant
B and F                8.21(?)
H and F               12.08        Probably significant


The probability levels for the distribution of $\chi^2$ with $f = 4$ are: 5% level = 9.49,
2% level = 11.67 and 1% level = 13.28. It is seen that all the six calculated
values of $\chi^2$ are above the expectation value of 4. There is significant variation in
the correlation coefficients $r_{LB}$ and $r_{LF}$, whilst it is largely a matter of personal
opinion as to whether the suggestion of lack of uniformity in $r_{BH}$ and $r_{HF}$
shall be judged significant or not.
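The correlation test described above amounts to transforming each coefficient to $z'$ and summing squared deviations about the mean. A minimal sketch follows, assuming equal sample sizes as in the text; the function name is hypothetical, and `math.atanh(r)` is used because it equals $\tfrac12\{\log_e(1+r) - \log_e(1-r)\}$.

```python
from math import atanh, tanh


def correlation_homogeneity_chi2(correlations, n):
    """Chi-square for the homogeneity of k correlation coefficients.

    correlations: the k sample coefficients, each from n pairs.
    Returns (chi2, degrees_of_freedom).
    """
    z_values = [atanh(r) for r in correlations]   # z' transformation
    z_mean = sum(z_values) / len(z_values)
    # Each z' has variance approximately 1 / (n - 3).
    chi2 = (n - 3) * sum((z - z_mean) ** 2 for z in z_values)
    return chi2, len(z_values) - 1


# Illustrative input (not from the paper): two samples of 13 pairs whose
# transformed values are 1 and 0, giving chi2 = 10 * (0.25 + 0.25) = 5.
example_chi2, example_df = correlation_homogeneity_chi2([tanh(1.0), 0.0], 13)
```

In the skull example each of the six sets contains five coefficients, so the statistic is referred to $\chi^2$ with 4 degrees of freedom.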

To summarize: analysis of the data leads to the following conclusions:

(a) The comprehensive test shows evidence of significant variation from sample
to sample of the variances and covariances.

(b) This lack of uniformity is not due to differences in the population variances.






38    A Test for the Homogeneity of Variances and Covariances

(c) There is clear evidence of heterogeneity among some of the correlation
coefficients, in particular for those between length and breadth of skull and
between length of skull and breadth of face. Results of this kind are frequently
met with when dealing with craniological data, the correlation coefficient often
being subject to a considerable degree of instability. The meaning to be attached
to these fluctuations in the correlations, when considered from a craniological
viewpoint, seems to be somewhat obscure.

The above example may give the impression that a great deal of labour is
required, even after the comprehensive test is used. If, however, the comprehensive
test had provided no evidence for rejecting the hypothesis $H_0$, it would,
of course, not have been necessary to make the four single-variate $L_1$ tests and
the six correlation tests. Even when the hypothesis $H_0$ is rejected and it is
necessary to apply separate tests to find the causes of the lack of uniformity, the
only labour which will have been wasted is the relatively small amount involved
in the calculation of the determinants $|v_{tsu}|$. The really lengthy computation is
that required to obtain the sums of squares and sums of products on which
Table Ia is based; this cannot be avoided if a detailed analysis is desired.

4. The adequacy of the Type I approximation

The work now falls into two stages, which will be concerned with

(a) the adequacy of the Type I curve of equation (7) to represent the unknown
true distribution of $l_1$;

(b) the accuracy of the empirical formulae (15) and (16) for $m_1$ and $m_2$.

The hope that the Type I form of curve might give an adequate approximation
was based on two main considerations:

(a) Values of $l_1$ are restricted and can only lie between zero and unity.

(b) On other occasions the use of the principle of likelihood has yielded
criteria which are either exactly distributed in the Type I form, or are so distributed
that a good approximation has been obtainable by the use of this kind
of curve.

In order to compare the true distribution with the approximate form, the
first four moments of both distributions have been calculated in a number of
cases. The manner of choosing the approximate distribution ensures that its first
two moments will agree with those of the true distribution, and an idea of the
accuracy of the approximation may be gathered by comparing the third and
fourth moments of the true distribution with the corresponding moments of the
approximate form. The distribution is such that a comparison of moments
calculated about the mean is easier than that obtained when moments taken
about zero are used. The cases considered fall into two groups:

(i) $k = 5$; $n$ = 10, 20, 40 and 50; $q$ = 1, 2, 3, 4, 5 and 6
and (ii) $n = 30$; $k$ = 2, 5 and 10; $q$ = 1, 2, 3, 4, 5 and 6.
The process of calculation was as follows:

(a) The first four moments about zero of the true distribution were obtained
by using 10-figure tables of logarithms of the Gamma function (E. S. Pearson,
1922). A large number of figures were required in the values of these moments,
since the dispersion of many of the distributions was small.

(b) The values of $\mu_2$, $\mu_3$ and $\mu_4$, the second, third and fourth moments about
the mean, were then obtained in the usual way and the constants $m_1$ and $m_2$
were calculated by using (8) and (9).

(c) The third and fourth moments about zero of the Type I distribution (7)
are given by

$$[\mu'_3] = \frac{m_1(m_1+1)(m_1+2)}{(m_1+m_2)(m_1+m_2+1)(m_1+m_2+2)}, \qquad (17)$$

$$[\mu'_4] = \frac{m_1(m_1+1)(m_1+2)(m_1+3)}{(m_1+m_2)(m_1+m_2+1)(m_1+m_2+2)(m_1+m_2+3)}, \qquad (18)$$

so that, using values of $m_1$ and $m_2$ already obtained, these may be calculated.

(d) From the values of $[\mu'_3]$ and $[\mu'_4]$ the corresponding moments about the
mean, $[\mu_3]$ and $[\mu_4]$, were calculated. Comparing $\mu_3$ and $\mu_4$ with $[\mu_3]$ and $[\mu_4]$,
respectively, gives an idea of the adequacy of the approximation.

Table II gives the results obtained; in addition, values of $\beta_1 = \mu_3^2/\mu_2^3$ and
$\beta_2 = \mu_4/\mu_2^2$ are included as measures of the skewness and kurtosis of the
distributions with which we are dealing.
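Steps (c) and (d) can be sketched as follows. The block is illustrative only: the function names are hypothetical, the moments about zero are built up from the product form of (17) and (18), and the conversion to moments about the mean uses the standard relations.

```python
def type1_raw_moments(m1, m2, order=4):
    """Moments about zero of the Type I curve (7), cf. (17) and (18)."""
    moments = []
    value = 1.0
    for j in range(order):
        # Each further moment multiplies by (m1 + j) / (m1 + m2 + j).
        value *= (m1 + j) / (m1 + m2 + j)
        moments.append(value)
    return moments


def central_moments(raw):
    """Second, third and fourth moments about the mean from four raw moments."""
    r1, r2, r3, r4 = raw
    mu2 = r2 - r1 ** 2
    mu3 = r3 - 3.0 * r1 * r2 + 2.0 * r1 ** 3
    mu4 = r4 - 4.0 * r1 * r3 + 6.0 * r1 ** 2 * r2 - 3.0 * r1 ** 4
    return mu2, mu3, mu4


def beta_coefficients(mu2, mu3, mu4):
    """Skewness and kurtosis measures beta_1 and beta_2."""
    return mu3 ** 2 / mu2 ** 3, mu4 / mu2 ** 2
```

As a check on the algebra, $m_1 = m_2 = 1$ gives the rectangular distribution on (0, 1), for which $\mu_2 = 1/12$, $\mu_3 = 0$, $\mu_4 = 1/80$ and $\beta_2 = 1.8$.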

It is seen from these tables that the distributions represented vary widely
in shape, but as far as can be judged by comparison of third and fourth
moments, the use of the Type I approximation would seem to be quite satisfactory
in every case considered. The cases dealt with cover a fairly wide field and
have not been chosen with a view to obtaining especially good agreement of the
moments.

The inclusion of curves for which $q = 1$ may be questioned having regard to
the fact, already stated, that the single variate problem has been fully treated.
We must therefore remark that, as $L_1 = \{l_1(q = 1)\}^2$, the fact that the distribution
of $L_1$ may be represented by a Type I curve does not imply that this kind of
approximation will work adequately for the distribution of $l_1 = \sqrt{L_1}$. Moreover,
some of the distributions obtained by putting $q = 1$ are abnormally skew and
highly leptokurtic. As it happens, when $q = 1$ the Type I curve appears to give
as good an agreement as for other values, but the exclusion of this set of distributions
might have had the effect of making the approximation appear better
than was really the case, in that it would have removed a number of the "more
difficult" distributions. Thus, although the criterion $L_1$ will probably always be
used in preference to $l_1(q = 1) = \sqrt{L_1}$, it seemed desirable to include distributions
for which $q = 1$ when discussing the adequacy of the Type I approximation to the
distribution of $l_1$.

It will be noted that there are sections common to the three parts of Table II. These have been
included more than once in order to facilitate comparisons in any one part of the table.


Consideration of an alternative form of approximation

For the larger values of $n$ which have been considered it is seen that the
distribution of $l_1$ is of small dispersion and is situated close to unity, the upper
limit of possible values of $l_1$. In fact, in many of the cases dealt with, the mean is
separated from the start of the curve by some twenty or thirty times the standard
deviation. In these circumstances it seemed possible that a better approximation
might be obtained by taking a Pearson Type I curve in the form

$$p(l_1) = \frac{\Gamma(m_1+m_2)}{\Gamma(m_1)\,\Gamma(m_2)}\,\frac{(l_1-b)^{m_1-1}(1-l_1)^{m_2-1}}{(1-b)^{m_1+m_2-1}} \quad\text{for } b \le l_1 \le 1. \qquad (19)$$

There are now three assignable parameters: $b$, defining the start of the curve, $m_1$
and $m_2$, which may be selected by equating the first three moments of the
distribution (19) to those of the true distribution of $l_1$. As before, an idea of the
degree of approximation may be obtained by comparing the fourth moment of $l_1$,
calculated from the mean, with that of the distribution (19). This procedure has
been carried out in a few cases, the results obtained being set out in the upper
part of Table III.

As far as can be judged by a comparison of moments, there is little difference
between the approximations obtained by using a Type I curve in form (7),
which is fitted by two moments, and the Type I curve in form (19), which is
fitted by three moments. However, as we are primarily concerned with the
probability levels of the distribution of $l_1$, the effect on these levels of the change
in the method of approximation must be ascertained.


TABLE III

Comparison of distributions fitted by two and three moments

($n = 30$ and $k = 5$ in every column)

                                          q = 1     q = 2     q = 3     q = 4     q = 5     q = 6

$\mu_4 \times 10^8$   Of true distribution      5.11(?)   29        90        198       341       480
                Of distribution (7)       5.08      29        90        198       339       481
                Of distribution (19)      5.08      29        90        198       339       480

5% level        Of distribution (7)       0.9674    0.9280    0.8767    0.8149    0.7449    0.6678
of $l_1$           Of distribution (19)      0.9674    0.9280    0.8767    0.8149    0.7449    0.6677

1% level        Of distribution (7)       0.9546    0.9110    0.8563    0.7915    0.7192    0.6411
of $l_1$           Of distribution (19)      0.9546    0.9110    0.8562    0.7915    0.7192    0.6413


The method of obtaining the 5% and 1% levels when the distribution is in
form (7) has been given. When the second approximation is considered, the
transformation

$$t = \frac{l_1 - b}{1 - b} \qquad (20)$$

gives the distribution of $t$ as

$$p(t) = \frac{\Gamma(m_1+m_2)}{\Gamma(m_1)\,\Gamma(m_2)}\,t^{m_1-1}(1-t)^{m_2-1},$$

so that, as before, the tables of $z$ may be utilized in calculating the levels. The
lower part of Table III compares the levels given by the two methods of
approximation.

It is seen that, for all practical purposes, the probability levels as given by
the two forms of Type I curve may be taken as identical, so that nothing is to be
gained by using the more complicated distribution (19). Furthermore, the agreement
obtained strengthens the conviction that the Type I curve in form (7)
really provides a good approximation to the true distribution. From now onward,
therefore, we shall only be concerned with the approximate distribution
given by (7).
5. Direct comparisons in cases where the true distribution of $l_1$ is known


When only two groups are considered the distribution of $l_1$ is known in the
cases of one and two variates. These known distributions may be used to test the
accuracy of the probability levels as obtained by using the Type I curve. If
$q = 1$ we have seen that $l_1^2 = L_1$. P. P. N. Nayer (1936) has shown that,
when $k = 2$,

$$P\{L_1 \le L\} = I_x\{\tfrac12(n-1), \tfrac12\}, \quad\text{where } x = L^2.$$

Now, $l_1(q = 1) = L_1^{1/2}$. Hence

$$P\{l_1 \le l\} = I_x\{\tfrac12(n-1), \tfrac12\}, \quad\text{where } x = l^4. \qquad (21)$$

When $q = 2$ and $k = 2$, Pearson & Wilks (1933) have proved that

$$P\{l_1 \le l\} = I_x(n-2, \tfrac12) + \frac{\Gamma(n-\tfrac32)}{\Gamma(n-2)\,\Gamma(\tfrac12)}\,x^{n-2}\log_e\frac{1+\sqrt{1-x}}{1-\sqrt{1-x}}, \quad\text{where } x = l^2. \qquad (22)$$


Thus if $k = 2$, relations (21) and (22) enable the probability integrals of $l_1(q = 1)$
and $l_1(q = 2)$ to be obtained by using Tables of the Incomplete Beta Function.
In this manner the true probability associated with each of the 5% and 1%
limits, obtained by using the Type I approximation, may be calculated. Results
obtained are given in Table IV.

For the values of $n$ considered it is seen, from Table IV, that the true probabilities
are very close to the desired values 0.05 and 0.01. It should be pointed
out that, as is shown in Table II, the six distributions considered above are
unusually skew and leptokurtic compared with the remainder of the distributions,
and moreover the agreement of true and approximate distribution, as
judged by the moments, is not noticeably better for these curves than for the
curves in general. Hence, for distributions which, judging from the values of
$\beta_1$ and $\beta_2$, are certainly not in any way favoured, the limits set by using the
approximation are found to give true probabilities near enough to the desired
values for all practical purposes.

To summarize the salient points in the preceding sections, we have seen that:

(a) For a wide range of distributions close agreement between the third and
fourth moments of the true and approximate distributions is obtained.

(b) Type I curves in form (19) fitted by three moments, which might be
expected to give a better approximation, lead to probability levels which are for
practical purposes the same as those given by the Type I curve fitted by two
moments.

(c) In cases where the true distribution is known, the limits set by the approximation
give values of the true probability near to those desired.

We may therefore say with some confidence that, having regard to the above
considerations, the use of the Type I curve in form (7) as an approximation to
the distribution of $l_1$ seems to be amply justified.


TABLE IV

                        q = 1        q = 1        q = 1
                        k = 2        k = 2        k = 2
                        n = 10       n = 30       n = 50

Type I 5% limit         0.8936       0.9669       0.9804
True probability        0.05006      0.05008      0.04960(?)

Type I 1% limit         0.8236       0.9435       0.9664
True probability        0.00998      0.01001      0.01002


                        q = 2        q = 2        q = 2
                        k = 2        k = 2        k = 2
                        n = 10       n = 30       n = 50

Type I 5% limit         0.7814       0.9324       0.9600
True probability        0.05005      0.04997      0.04979

Type I 1% limit         0.6989       0.9034       0.9425
True probability        0.00999(?)   0.0099?      0.00998


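6. The limiting form of the distribution of $l_1$

As was the case in the single variate problem (Neyman & Pearson, 1931),
the distribution of $l_1$ in large samples may be obtained approximately from that
of $\chi^2$. The $h$th moment of the sampling distribution of $\lambda_1 = l_1^N$, $M_h(\lambda_1)$, may be
obtained from (6) by replacing $h$ by $Nh$. Using Stirling's approximation,
it may readily be shown that $M_h(\lambda_1)$ tends uniformly to $(1+h)^{-\frac12 f}$ as $n$ tends to $\infty$,
for all $h \ge 0$, where

$$f = \tfrac12(k-1)\,q(q+1). \qquad (23)$$

The distribution of $\chi^2$ with $f$ degrees of freedom is

$$p(\chi^2) = \{2^{\frac12 f}\,\Gamma(\tfrac12 f)\}^{-1}(\chi^2)^{\frac12 f-1}e^{-\frac12\chi^2} \quad\text{for } 0 \le \chi^2 \le \infty. \qquad (24)$$

If we put $y = e^{-\frac12\chi^2}$ (so that $0 \le y \le 1$), the $h$th moment of $y$ is

$$M_h(y) = \int_0^1 y^h p(y)\,dy = \{2^{\frac12 f}\,\Gamma(\tfrac12 f)\}^{-1}\int_0^\infty (\chi^2)^{\frac12 f-1}e^{-\frac12(1+h)\chi^2}\,d(\chi^2) = (1+h)^{-\frac12 f}.$$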
Thus the corresponding moments of the distribution of $y$ and the limiting form
of the distribution of $\lambda_1$ are equal. The range of possible values of the variables
being finite, i.e. from zero to unity, the equality of moments is sufficient to ensure
that the distribution of $y$ and the limiting distribution of $\lambda_1$ are identical.
Therefore, in large samples, $-2\log_e\lambda_1 = -2N\log_e l_1$ will be approximately
distributed as $\chi^2$ with $f$ degrees of freedom. Thus if

$$l_1 = e^{-\chi^2/(2N)}, \qquad (25)$$

it follows that the probability levels for $l_1$ may be obtained by inserting in (25)
the corresponding levels for $\chi^2$. It had been hoped that this method of calculating
the probability levels might prove satisfactory for moderately large values
of $n$, so that the labour of calculating the first two moments of $l_1$ could have been
avoided. In actual fact, however, this hope is not realized, as is shown by the
results given in Table V.
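The large-sample levels are quick to compute, which is what made the method attractive. The sketch below assumes the relation $l_1 = e^{-\chi^2/(2N)}$ with $N = kn$ and $f = \tfrac12(k-1)q(q+1)$ degrees of freedom; the function names are hypothetical, and the $\chi^2$ levels 9.488 and 13.277 for 4 degrees of freedom are standard tabled values supplied here as inputs.

```python
from math import exp


def chi2_degrees_of_freedom(k, q):
    """Degrees of freedom f of equation (23)."""
    return 0.5 * (k - 1) * q * (q + 1)


def l1_level_from_chi2(chi2_level, k, n):
    """Large-sample level of l_1 from a level of chi-square, with N = k * n."""
    total = k * n
    return exp(-chi2_level / (2.0 * total))


# k = 5, n = 30, q = 1, so that f = 4.
f = chi2_degrees_of_freedom(5, 1)
level_5 = l1_level_from_chi2(9.488, 5, 30)
level_1 = l1_level_from_chi2(13.277, 5, 30)
```

For this case the levels come out near 0.9689 and 0.9567, somewhat above the Type I levels 0.9674 and 0.9546 quoted for the same case in Table III.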


TABLE V 




'i:r\ 

k s5 n , 

n=.30 

1 

(1) Level from Type. I 

(2) Level from y’ 
Difforenco(2)-(l)dividetl 

by /(/<■,) 

iin.' lo; 

'* ,0 ‘ ,11 

tti)ii(i9 ndMsri 
0-flfifid Odl4R2 
O' 13 0'22 

fld- in 

" ,(( ‘ ,n 

1M)B74 O-iIMB 
0-9B89 0-H5fi7 
O-Ifi 0.22 

2 

(1) Level from Type I 

(2) Level from y’ 

Differonoe {2)-(l) divided 

by 


0-92H{) 0-9110 
0-9323 0-9163 
0-28 0-32 

3 

(1) Level from Type I 

(2) I,fivel from y* 

Difference (21 - (1) divided 

by/W 

()-8913 (tH57(( 
0-1)0(14 ()-8e93 
0-31 0-4() 

0-8767 0-8563 
0-8857 0-8(Hyi 
0-39 0-45 

i 

(1) Level from Typo I 

(2) Level from y* 
Differenoe (2)-(l) divided 

byVW 


0-8149 0-7915 
0-8304 0.8087 
0-55 (1-6! 

6 

(1) Level from Type I 

(2) Level from y’ 
Differenoe (2) - ( 1) divided 

by V(ft) 

0-7884 (1-7476 
0-8119 0'77S0 
0-53 0-62 

0-7446 0-7192 
0.7684 0-7448 
0-73 0.79 

6 

(1) Level from Type I 

(2) Level from y* 
Difieteivce (2) ~ ( 1) divided 

by d(h) 

— , — 

0-6678 0-6411 
0-7014 0-6789 
0-94 1-01 


k n 

i- 


i- 

10 

it 

■ 


.Vi 


3*5 

' 



t N 


[ft 

i ,1 • 

*’ !♦ ^ U 

•’ it 

» (01 

•» rt 

0-117.57 0-1MW2 

O'lWI" 

« 07,11 

f*i5*585 

**4W2!5 

0-fl7l«i IMlfi7.'i 

051812 

5»-!n,'W 

»5'tt722 

55 !«i45 

0-12 o-l.-i 

O'lll 

55- 12 

55 Is 

0 22 i 

0-9463 0-933« 

041,773 

04(47«» 


il 

0-948S 0-ll3(Hl 

0-9.WS 

55 Wall 



0-20 0-24 

0-15 

oqti 


i 

0-iKi7U 0-KSI22 j O-WIBS 


5*8783 

55Klt.f3! 

0-11130 0-K!lHl 

0-029.8 

iJDlfU 

1 * *55517 

O'Hum 

a-Hd im 

51-24 

H.2.8 

5|.V5 

55 » 

0-8610 0-»12« 

O-HtPii 




0-8698 0-8528 

0'«W5 

0-8*«4 



0-41 (t47 

5t'33 

55.38 



0-8067 n-7HB7 

0-a446 O-HtBl 



0-82(16 0-8018 

0-8S7 n-KMl 


■ 

0-S5 O-BO 

0-44 

0'4« 


" 

0-7462 0-7244 

0-79M 0-770! 


.... i 

0-7666 t)-7463 

b-W 

0-"313 



.... i 

0-72 0-78 

0-64 

(1-68 


1 
It is evident from inspection of Table V that, as would be expected, the
agreement between the levels, as given by the Type I and $\chi^2$ approximations,
improves with increase of $n$, whereas it slowly becomes worse as $k$ is increased.
However, as $q$ increases there is such a rapid deterioration in the accuracy of the
$\chi^2$ approximation that this method cannot be justifiably employed, except for
large values of $n$.

It may however be usefully noted that the levels obtained from the $\chi^2$ transformation
are always above those set by the Type I curve, so that if the hypothesis
is not rejected when the level given by $\chi^2$ is employed it certainly would not be
rejected if the more accurate Type I level had been calculated.


7. Empirical relations for $m_1$ and $m_2$

When using the $l_1$ test it is found that the calculation of the first and second
moments of $l_1$ accounts for the major part of the labour of computation which
is involved. It therefore seemed desirable to attempt to obtain empirical relations
giving $m_1$ and $m_2$ in terms of $n$, $k$ and $q$, such that the use of these values will
lead to probability levels of $l_1$ sufficiently accurate for practical purposes.

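We have seen that $M_h(\lambda_1)$ tends uniformly to $(1+h)^{-\frac12 f}$ as $n$ tends to $\infty$.

Put $l'_1 = \lambda_1^{1/M}$, where $M$ is any positive number. Then the $h$th moment of
$l'_1$ tends uniformly to $(1+h/M)^{-\frac12 f}$ as $N \to \infty$. If a Type I curve be fitted to the
distribution of $l'_1$, in a way similar to the one employed when approximating to the
distribution of $l_1$, the exponent of the power of $(1 - l'_1)$ will be given by $m'_2 - 1$,
where $m'_2$ is obtained from (9) on using the moments of $l'_1$,
and this $m'_2$ may be considered as a function, $\psi(M, N)$, of $M$ and $N$.

Now,

$$\lim_{N\to\infty}\psi(M, N) = \phi(M) = \frac{\{1-(1+1/M)^{-\frac12 f}\}\{(1+1/M)^{-\frac12 f}-(1+2/M)^{-\frac12 f}\}}{(1+2/M)^{-\frac12 f}-(1+1/M)^{-f}},$$

the limit being approached uniformly for all positive $M$, and $\lim_{M\to\infty}\phi(M) = \tfrac12 f$.

Hence

$$\lim_{N\to\infty}\psi(N, N) = \lim_{N\to\infty} m_2 = \tfrac12 f = 0.25(k-1)\,q(q+1).$$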
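The limiting value $\tfrac12 f$ can be checked numerically. The sketch below is an illustration under stated assumptions: it takes the limiting moments of $l'_1$ to be $(1+h/M)^{-\frac12 f}$, inserts them into the two-moment formula (9), and evaluates the result for a large $M$; the function name is hypothetical.

```python
def limiting_m2(M, f):
    """Two-moment Type I parameter m_2 fitted to the limiting moments of l'_1."""
    first = (1.0 + 1.0 / M) ** (-0.5 * f)    # limiting first moment
    second = (1.0 + 2.0 / M) ** (-0.5 * f)   # limiting second moment
    variance = second - first ** 2
    return (1.0 - first) * (first - second) / variance


# With f = 4 (for instance k = 5, q = 1) the value approaches f / 2 = 2.
approximate_m2 = limiting_m2(1000.0, 4.0)
```

For very large $M$ the variance in the denominator becomes a small difference of nearly equal numbers, so moderate values such as $M = 1000$ are a safer choice in floating-point arithmetic than extremely large ones.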


Values of $m_1$ and $m_2$ which have been calculated in a number of cases are
given in Table VI. The values of $m_2$ indicate that the limit $\tfrac12 f$ is approached rapidly
with increase of $n$, and it seems probable that, for $n \ge 20$, no error of practical
importance will be made if we substitute $\tfrac12 f$ for $m_2$.

Methods of approximating to $m_1$ must now be considered. U. S. Nair (1938,
p. 285) has shown that the distribution of $l_1$ may be reduced to the form

$$p(l_1) = p_1(l_1)\,p_2(l_1),$$

where

$$p_1(l_1) = \text{const} \times l_1^{k(n-q)-1}$$

and

$$p_2(l_1) = \int \cdots \prod_{j=1}^{q}\frac{\cdots}{\Gamma\{\tfrac12(k(q-1)+1-j+t)\}}\,dt,$$
TABLE VI

Values of $m_1$ and $m_2$



9=1 

II *> ‘ 

r-3 

. - 

q-^i 

r {.. h 

■ i 


l; = 2 

17-361 

16-500 ■' 

i 




n=Ul 

0-601 

1-601 , 

1 


} 


k ^2 

67-282 

53-688 ! 

63-732 

6M1S3 

•|!M32 ,1 

« !8*7 

ji = 3(l 

0-6(K) 

I -5(31 i 

2-0181 

-f-wni , 

T-.Vtl i 

lo .71 1 

k = 2 

07-20!) 

1I6-58H 



1 

J 


n - 50 

0-600 

l-r4HI 

., ,..-.4 


’ 

- ■ I 



42-097 

38-210 

1 

33- m 

27-627 

1 

21 7rt5» 1 

in 1 7s 

n = Id 

2-(KI5 

(MHH 

12-tllK 

2JM80 

31.»22 ; 

4.6 1*63 

<!«5 

02-601 

HH-18K 

83-063 

TT-imt 

Tn-768 

03 012 

a = 20 

2-(KH 

(l-(KKt 

12-(KH 

2<i-030 

30-17" 

42 (¥»lf 

fcs! 6 

142-660 

13K-177 

133-033 

127l«M 

121*430 

ikm'h; 

n s 30 

2-000 


12-tHll 

2<MIH 

3o-i»7l 

42 2M 

4 =» 5 

192-543 

188-178 

183-014 

177*043 

17i*-2h3 

lft3 .h-K* 

n » 40 

2-(KK) 

«-(KKI 

12-(8K) 

anxw 

W.»n3« 

42-127 


242-632 

238-176 

233-020 

227-017 

22o-iltl 


U!=50 

2-(KK) 

fl-!KK) 

12-(HI1 

20-006 

mm 

42-f*7.6 

fc=l{) 

2M-8H9 

276-83(1 

206-206 

262-81 i 

238-«!l 

«■»«» 

n = 30 

4-601 

13-501 

27-(KH5 

460.111 

(17-710 


I; SB 26 

711-207 

088-001 





n = 3() 

12-(KM 

1 „„ 

36-(KI7 

Pii-wvifwffiiin j , 


1 



In each cell $m_1$ is the upper and $m_2$ the lower number.


so that $p_2(l_1)$ is independent of $n$. Now if $\tfrac12 f$ replaces $m_2$, the Type I distribution
(7) takes the form

$$p(l_1) = \text{const} \times l_1^{m_1-1}(1-l_1)^{\frac12 f-1},$$

so that it is expressed as the product of two factors $l_1^{m_1-1}$ and $(1-l_1)^{\frac12 f-1}$, the
second of which is independent of $n$. In these circumstances it seemed possible
that $m_1$ might perhaps be replaced by $k(n-q)$. Therefore putting

$$m_1 = k(n-q)$$

and

$$m_2 = 0.25(k-1)\,q(q+1),$$

the 5% and 1% levels of $l_1$ were calculated in a number of cases and compared
with those given by using the values of $m_1$ and $m_2$ obtained from the moments.
The results given in Table VII indicate that, although the agreement in cases
where $q = 1$ is satisfactory, for larger values of $q$ the discrepancies become too
large to enable this method of approximation to be used in practice.

TABLE VII 

Comparison of levels given by Method 1 and Method 2




5 = 

1 

■ 



= 6 


Method 

.■5% 

1 % 

5% 

1% 

r.o/ 

/I) 

1% 

k = 2 

1 

0'0(i7 

0-044 

0-801 

0-858 

0-788 

0-748 

n = 30 

0 

0-!)ti7 

0-044 

0-892 

0-858 

()-791 

0-750 

k = r) 

1 

O+lMi 

0-8.')7 

0-021 

()-570 

0-304 

0-203 

n ~\0 

2 

O'OOl 

0-804 

0-035 

(>•585 

0-340 

0-304 

A- = 5 

1 

OdISO 

0-031 

0-814 

0-784 

0-624 


n = 20 

2 

(l-O.W 

()'933 

(>•818 

0-788 

()'(!4() 

()-007 

A = .') 

1 

O'907 

O-OSI) 

0-877 

0-8.50 

0-745 

0-710 

!! = 30 

2 

0d)(W 

0-056 

0-878 

0-858 

0-752 

0-728 

A ss i) 

1 

0-07(1 

0-060 

0-9(18 

0-81)2 

0-807 

0-787 

n » 40 

2 

0-07(1 

0-907 

0-900 

0-803 

0-811 

0-702 

A = 5 

1 

(l-OHI 

0-073 

0-027 

0-014 

0-845 

()-828 

11 = dO 


0-081 

()-973 

0-027 

0-015 

0-848 

0-831 

A = 10 


n 

0-003 

0-878 

0-8(14 



n s 30 



(1-002 

O-HHO 

0-8(i0 




Method 1. $m_1$ and $m_2$ obtained from the first two moments of $l_1$.
Method 2. $m_1$ and $m_2$ given by the empirical relations
$m_1 = k(n-q)$, $m_2 = 0.25(k-1)\,q(q+1)$.


It is clear that the deviations are due almost entirely to an inadequate
approximation to $m_1$, although the expression $k(n-q)$ does not appear to differ from
$m_1$ in such a marked manner that its use as a basis of approximation must be rejected
entirely. Thus it was thought that some slight correcting term might be subtracted
from $k(n-q)$ to provide a better approximation to $m_1$. With this end in
view values of the necessary correction $C = k(n-q) - m_1$ were calculated and
are given in Table VIII.

Biometrika xxxi    4
TABLE VIII


Values of correction $C = k(n-q) - m_1$





(/ 1 

k = 2. « = n> 

((■(if) 


(1-72 

^ = 2, n = .W 

(J'73 

k s= fi, II == 16 

2'3(l 

k = ri. 11 ^-.1 2(1 

2-41 

k « fi, n - :)6 

2-44 

k = .'i, II => 40 

24(i 

fc « 0, H K 'lO 

247 

k » 1(1, It « 3(1 

.7'3I 

k « 23, II = 3(1 

13'71l 


(/ -. 3 j II * 


(t4(( 


1141 : 

i)-27 <( 32 

(141 


- -j 

1-72 

I-«» i 2-47 

l-xt 

( 24t 

1-H2 

l‘it7 i 2-01 

l-a2 ; 

MCI 250* 

l-«2 

l-VCt 5 2'W*< 

i 

4 1(5 

" • 1 

4-7!» i "-Itl 

114(1 

; 




>/- 




.« '.a 

i 

ill 

SI 


IJH 


i 

f ■" *» > 


I ist 


:iv3 

f, tfi( 

f'l S} 

•J 17 
7 in 




When 20, the variutiou «f C with n. fov lixnl k an'l is siuttii . ?4i< thni m 
seeking a 8im\)ie empirical relation for (\ functionK o( t anti »| ••nly wen* t'i»n. 
sidered. If, for constant <], (' he plotted ugaiuHt k\ if i« funtjii that fltr« ndafittn 
between them is nearly linear and. inoiwver. f ntny he taken m projK*rft(»niiI to 
•(fc-l). Assuming therefore, that (’ -- [k~ \ vahioH of fnj- ditlen'id »/ way 
be calculated. On plotting f),i againattj. the form of the re«»lfin« t'orv e »«»wge»{K 
that, over the range of values of i} which aw corteitiered. will he given with 
sufficient accuracy by a ipiadratic form in q. Making this awnuiipfion the eoeffi 
cients of the quadratic wove adjusted hy trial and error until the cspn’asion w hieh 
resulted seemed likely to give satisfactory results in moKi civstw., 'rin* rorr«‘fting 
term was thus tentatively obtained in the form 

$$C = 0.01(k-1)(90 - 39q + 9q^2).$$

The final replacement for $m_1$, as previously given, is therefore

$$m_1 = k(n-q) - 0.01(k-1)(90 - 39q + 9q^2), \tag{15}$$

and as before

$$m_2 = 0.25(k-1)\,q(q+1). \tag{16}$$

It is realized that the reader may have some doubts as to the validity of the
empirical methods by which this approximation for $m_1$ was obtained. In order
therefore to establish this approximation as a satisfactory practical method, it
is necessary to calculate for a wide range of cases the 5% and 1% levels given by
the approximation. These may then be compared with the levels obtained when
values of $m_1$ and $m_2$ calculated from the moments are used. This has been done in
a number of cases, the results being given in Table IX.
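
The comparison just described can be repeated numerically. The following is a minimal sketch, not the author's procedure: it reads the coefficients of (15) as 90, 39 and 9, assumes that the criterion is referred to a Type I (beta) curve with exponents $m_1 - 1$ and $m_2 - 1$, and replaces interpolation in the $z$-tables by direct numerical integration. All function names are hypothetical.

```python
import math


def empirical_exponents(k, n, q):
    """Return (m1, m2) from the empirical relations (15) and (16)."""
    m1 = k * (n - q) - 0.01 * (k - 1) * (90 - 39 * q + 9 * q * q)
    m2 = 0.25 * (k - 1) * q * (q + 1)
    return m1, m2


def beta_cdf(x, a, b, steps=2000):
    """Area of the Type I curve from 0 to x, by Simpson's rule.

    Assumes a > 1 and 0 <= x < 1, so the integrand is finite on [0, x].
    """
    if x <= 0.0:
        return 0.0
    log_norm = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

    def density(t):
        if t <= 0.0:
            return 0.0
        return math.exp((a - 1) * math.log(t) + (b - 1) * math.log1p(-t) - log_norm)

    h = x / steps
    total = density(0.0) + density(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(i * h)
    return total * h / 3.0


def lower_level(k, n, q, alpha):
    """Lower 100*alpha per cent point of the approximating curve, by bisection."""
    m1, m2 = empirical_exponents(k, n, q)
    lo, hi = 0.0, 1.0
    for _ in range(40):
        mid = 0.5 * (lo + hi)
        if beta_cdf(mid, m1, m2) < alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


print(empirical_exponents(2, 10, 1))
print(round(lower_level(2, 10, 1, 0.05), 3))
```

For $k = 2$, $n = 10$, $q = 1$ the sketch gives $m_1 = 17.4$ and $m_2 = 0.5$; the resulting levels can be set against the Method 3 entries of Table IX.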

Inspection of this table shows that when $n \geq 20$ the levels obtained by
using values of $m_1$ and $m_2$ as given by (15) and (16) agree with those resulting
TABLE IX


Comparison of levels given by Method 1 and Method 3





7 = 

1 

7 = 

2 

9 = 

■3 


4 

? = 

= 6 

tf = 

= 6 

k 

n 

Method 

6% 

1 % 

6 % 

1 % 

5 % 

1% 

B 

1 % 

5 % 

1 % 

6 % 

1 0 / 

/o 

2 



0 - B 94 

0-824 

0-781 

0-699 

_ 





■ 


__ 




0-894 

0-824 

0-780 

0-698 

— 

~ 

— 

— 

— 


— 

— 

2 

31 ) 

1 

0-067 


0-932 

0-903 

0-891 

0 - 8,68 

0-843 

0 - fi ().6 

0-788 

0-748 

0-728 

0-686 



.3 

0-067 

0-044 

0-932 

0-003 

0-891 

0 - 8.67 

0-842 

0-804 

0-780 

0-745 

0-726 

0-681 

2 

50 

1 

0-080 

0-066 

0-960 

0-943 


— 






— 




3 

0 - 98 {) 

0-060 

0-060 

0-943 

— 

— 



— 

— 



S 

10 

1 

0-806 

0-857 

0-772 

0-724 

l )-621 

0-570 

0-458 

0-409 

0 - 304 * 

0 - 263 * 

0 - 174 * 

0 - 145 * 



3 

0-806 

0-857 

0-771 

0-723 

()- ni 9 

0-567 

0 - 4,64 

0-405 

0-292 

0-251 

0-140 

0-118 

5 

20 

1 

0 - 0,70 

n -031 

0-801 

0-865 

0-814 

0-784 

()-724 


0-624 

0-590 

0 -, 62 l 

0-488 



3 

0 - 0,60 

0-931 

O-HOO 

o-aon 

0-814 

()-784 

0-723 

0 - 61 K ) 

0-624 

0-690 

0-620 


n 

30 

1 

0-067 

o - flr>fi 

0-028 

0-911 

0-877 

0 - H 5 (! 

0-815 

0-792 

0-745 

0-719 

0-668 

0-041 



3 

( 1-967 

0-955 

0-928 

0-911 

0-877 

0 - 8,60 

0-815 

0-791 

0 - 74.6 

0 - 71 !) 

0-668 

0-041 

in 

30 

1 

0-971 

0-963 


0-920 

0-878 

0-864 

0-813 

0-707 







3 

(I-!)?! 

0-963 

0-931 

0-920 

0-878 

0-804 

0-813 

0-707 

«... 




25 

30 

1 

0-975 

0-970 

0-936 







... 





3 

0 - 07.6 

0-970 


0-929 

.. 

...... 


* 







Method 1. $m_1$ and $m_2$ obtained from the first two moments of $L_1$.
Method 3. $m_1$ and $m_2$ given by the empirical relations
$m_1 = k(n-q) - 0.01(k-1)(90 - 39q + 9q^2)$, $m_2 = 0.25(k-1)\,q(q+1)$.


from the use of values of $m_1$ and $m_2$ calculated from the moments with sufficient
accuracy for practical purposes. Even when $n$ is as small as 10 the only serious
discrepancies, marked with an asterisk, occur when there are as many as five or
six variables.

The range of values of $m_1$ and $m_2$ of the distributions considered in the last
table may be found by consulting Table VI. From the practical viewpoint, it
must be noted that when both degrees of freedom exceed 60 it becomes impossible to
interpolate in the $z$-tables without running the risk of making appreciable errors.
In these cases an approximation, due to Fisher (1938b, § 41), may be used. When
$z$ has degrees of freedom $f_1$ and $f_2$ which are both large and $f = 2f_1f_2/(f_1+f_2)$ is
their harmonic mean, the 5% level of $z$ is given approximately by

$$z = \frac{1.6449}{\sqrt{f-1}} - 0.7843\left(\frac{1}{f_1} - \frac{1}{f_2}\right),$$

and the 1% level by

$$z = \frac{2.3263}{\sqrt{f-1.4}} - 1.235\left(\frac{1}{f_1} - \frac{1}{f_2}\right).$$
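
As an illustration of the approximation for large degrees of freedom, the following sketch assumes that it takes the form $z = 1.6449/\sqrt{f-1} - 0.7843(1/f_1 - 1/f_2)$ at the 5% level and $z = 2.3263/\sqrt{f-1.4} - 1.235(1/f_1 - 1/f_2)$ at the 1% level, with $f$ the harmonic mean of $f_1$ and $f_2$. The function name is hypothetical, and the conversion to a variance ratio uses $z = \tfrac{1}{2}\log_e F$.

```python
import math


def fisher_z_level(f1, f2, level):
    """Approximate 5% or 1% point of Fisher's z for large f1 and f2."""
    f = 2.0 * f1 * f2 / (f1 + f2)      # harmonic mean of the degrees of freedom
    spread = 1.0 / f1 - 1.0 / f2
    if level == 0.05:
        return 1.6449 / math.sqrt(f - 1.0) - 0.7843 * spread
    if level == 0.01:
        return 2.3263 / math.sqrt(f - 1.4) - 1.235 * spread
    raise ValueError("only the 5% and 1% levels are covered")


z = fisher_z_level(24, 60, 0.05)
print(round(z, 4), round(math.exp(2 * z), 3))
```

With $f_1 = 24$ and $f_2 = 60$ the result can be checked against tabulated values of $z$ before the formula is trusted for larger degrees of freedom.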
A Test for the Homogeneity of Variances and Covariances

8. Some consideration of the case of unequal samples

We have so far considered $k$ equal samples each of size $n$, but in the general
case there may be $k$ samples of sizes $n_t$ $(t = 1, 2, \ldots, k)$. The appropriate criterion
in this case is

'7 

1/ Mi >■..;! 1 i 

whore now 

! "i 

*-'«»( 1 , ^ h* •■*!> ' 

'HI - 1 

!•* lift" ^w/J 

and 

„ ‘ V • 

‘bv. „ 2 j •*(!m 
11 / 1 I 

A* - V 

1 i 


It is seen that $L_1$ is the ratio of a weighted geometric mean of the generalized
variances to a determinant of order $q$ in which the element in the $i$th row and
$j$th column is the weighted arithmetic mean, the weight
given in each case being the corresponding sample size. When considering single
variate criteria, Bartlett (1937) suggested that a better test will be obtained if each
variance is weighted with the number of degrees of freedom it possesses rather than
with the sample size, and more recent unpublished work confirms this contention.
Thus there is a possibility that in the general case, where more than one variate
is considered, some adjustment of the weighting will yield a criterion which will
more frequently detect the falsehood of the hypothesis $H_0$ when in fact some
alternative hypothesis is true. That is to say, the modified test might prove to be
more powerful, in the sense of Neyman & Pearson (1936a, b), with regard to a certain
set of alternative hypotheses. However, it is likely that such a modification will
only have an appreciable effect when some, at least, of the samples are small.

The preceding reasoning concerning the limiting form of the distribution of
$L_1$ may be shown to apply whether the samples are of the same size or not. The
distribution of $L_1$, as approximated to by means of the $\chi^2$ transformation, depends
only on $N$, $k$ and $q$, and not on the size of individual samples. It therefore seems
possible that with large samples, which although unequal do not vary greatly in
size, a reasonable approximation to the true levels may be obtained by using the
transformation of equation (25).

The accuracy of the $\chi^2$ method of approximation may be improved by
following a procedure similar to that used by Bartlett (1937) when dealing with
a single variate. In the general case, the $h$th moment of $\lambda_1$ is given by


A k + i i \ 


iY{i 1 '-•ft f ' 

r"‘ 2 ' )) 

In the limit $-2\log_e \lambda_1$ is distributed as $\chi^2$ with $f$ degrees of freedom and our
purpose is to find a correcting factor $C$, a function of $k$ and $q$ which only
involves terms of order $(n_t)^{-1}$, such that $-2C^{-1}\log_e \lambda_1$ has a mean value differing


^a(Ai) 


k 

n 

i-i 


'I 

All £«1 


1 


m 
from $f$, the mean value of $\chi^2$, by terms of order $(n_t)^{-2}$. Thus $Cf$ must be identical
with the sum of terms of order in the coefficient of h in the expansion, in 
powers of h, of log,, Using Stirling’s approximation we obtain 


a = 



m~i)V 


+ k + i~l N 

ifi [ 'YN iV - i + 1 - r S{N-k + 1 -'i)^ 


If now $-2C^{-1}\log_e \lambda_1$ be referred to the $\chi^2$ distribution with $f$ degrees of freedom,
approximate levels for $L_1$ may be obtained from the corresponding levels for $\chi^2$
by using the relation


h = exp 


I 


2N 


.(27) 


The degree of accuracy provided by this method is indicated in Table X, which
compares, for some typical cases, the levels given by (27) with those obtained by
fitting a Type I curve.

TABLE X 



</=i 

r/ = 2 

ry = ;f 

ff-.4 

y=5 

(/ so (1 

n, « 10 
(/-J... 

,.7) 

r,"' level 

■ 1 1'Vom rolaliim (27) 

O-SilO 

0.772 

0.774 

()'(i2] 

(Mi28 

(hm 

0.‘173 

(K-)()4 

().32K 

(M74 

0.200 




0.857 

0.857 

0.724 

0.72(1 

u-m 

ihm? 

().4O0 

(1.425 

0.2(i3 

(I•2«H 

(>145 

o-no 

i!; = S 

111 = 20 
{/-I,.. 

.. r>] 


()'95i) 

()‘9r>o 

O-SOl 

0-Ki)0 

(hHU 

()'813 

0-724 

(1.724 

0.t!24 

0.|i2H 

0.521 

0528 




O-Ml 

(MI31 

.. 

O’HOO 

(I'Hmi 

0.7H.i 

().783 

(l•(!!)l 

(l•(i^)2 

0.5!)() 

().51I4 

(I..188 

(1.495 


It is apparent from Tables V and X that the crude $\chi^2$ approximation has been
considerably improved by the use of the correcting factor $C$. Although examples
in which the $k$ samples are of unequal sizes have not been given in Table X, it is
probable that, if the smallest sample contains at least twenty individuals, the
application of the above method will give a fairly satisfactory result.

Mention may be made in passing of a similar approximation which is found
to give close agreement with the distribution of the comprehensive likelihood-ratio
criterion for testing one of the hypotheses concerning the values of the means
of populations referred to in the introduction to this paper. It is assumed that
the hypothesis $H_0$ regarding the equality of variances and covariances is true,
and it is desired to test whether the corresponding means of the $q$ characters are
the same in all $k$ populations. In the case of two samples ($k = 2$), the test is
equivalent to use of Hotelling's generalized $T$. If

1 k 111 
iVt..u. 1 


] k n, 

where a’<, “ .r ij iL ^>(0 

A I,-. 1 i 1 


the appropriate comprehensive likelihood criterion was given by Wilks (1932) in the form



The distribution of $W$ may be expressed simply in the following three cases:


(a) When $q = 1$,

(b) When $q = 2$,

(c) When $k = 2$,


?'(»■) = ' "’(I- 

p(,dr) = 1, k- I)J ' (vfl'f " ® (J ■ V '■ 


In other cases, following the previous procedure, it may be shown that approximate
levels for $W$ may be obtained by using the relation


H' 



where 


G'=U 


1 { li(k~2) N(3(k-l)^^2) l-U^l 

q{k - 1 ) i.i \ N -T (i(W - 1 )*' ' W'-- ?'+ i ». 


A' I 
1 i|®i’ 


and $\chi^2$ has $q(k-1)$ degrees of freedom. Calculations which have been made for
certain examples show that this method of approximation will give the levels of
W with sufficient accuracy for most practical purposes if the ratio A' i- is not 
less than 10. 


When $q = 1$, $W$ is just the ratio of two sums of squares, familiar in the analysis of variance.


SUMMARY

The main object of this paper has been to put into easier form for application
the likelihood ratio criterion for testing the hypothesis that all the corresponding
variances and covariances in a number, $k$, of multivariate $q$-variable normal
populations are identical. In the case where the samples from each population
are of the same size ($n_t = n$, $t = 1, \ldots, k$) empirical formulae have been given,
using which 5% and 1% probability levels for $L_1$ may be obtained readily from
the levels of R. A. Fisher's $z$; alternatively the Tables of the Incomplete Beta-Function
may be used. The conditions under which these formulae may be
safely employed are discussed.

If the samples are large and $N = \sum_{t=1}^{k} n_t$, $-2\log_e \lambda_1$ is distributed approximately
as $\chi^2$ with degrees of freedom $f = \tfrac{1}{2}(k-1)\,q(q+1)$. The test is then
independent of the individual values of $n_t$, which may now differ. If there are many
variables the sample sizes must, however, be very large for this transformation
to be justifiable.

A more accurate $\chi^2$ approximation, of the type suggested by M. S. Bartlett in
the case of a single variable ($q = 1$), has also been considered. The corrective
term is, however, rather troublesome to calculate unless again $n_t = n$ ($t = 1, \ldots, k$).

In conclusion I wish to thank Prof. E. S. Pearson for much assistance in the
preparation of this paper.

Additional note

U. S. Nair has recently obtained, on a theoretical basis, an approximation to
the distribution of $L_1$ in the case of $k$ equal samples, somewhat similar to the one
given as Method 3 on pp. 50-51.

He finds that the distribution of $L_1$ may be taken approximately as
p(Zj) = const -I 

where l3g+28)-^(2g-5)-- 

As far as we have been able to check, this approximation and Method 3 seem to
be adequate for the same values of $n$ and appear to give levels which are
substantially the same.
OBSERVED AND THEORETICAL RATIOS IN
MENDELIAN INHERITANCE


By E. ROBERTS, W. M. DAWSON* AND MARGARET MADDEN*


University of Illinois


According to the present conception of Mendelian inheritance genes are located
on chromosomes, and if not located on the same chromosome are independent
in inheritance. During the formation of reproductive cells the number of chromosomes
is reduced to one-half the number common to the species or one-half the
number found in the somatic cells. As a result of fertilization the original number
is restored.

An allelomorphic pair of genes are genes occupying the same relative loci on
homologous chromosomes, one of which came from the father and the other
from the mother. The members (alleles) of an allelomorphic pair under usual
conditions separate, going into different reproductive cells or gametes. When
located on different chromosomes the result is as many different kinds of germ
cells as the number of possible combinations of the genes, except that members
of an allelomorphic pair are not found in the same gamete. For any number of
allelomorphic pairs the number of different kinds of gametes is $2^n$, in which $n$ is
the number of allelomorphic pairs. In case of dominance and recessiveness the
number of visible classes or phenotypes formed among the offspring as the result
of random fertilization is also $2^n$.
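
The count of $2^n$ kinds of gametes can be illustrated by direct enumeration. The sketch below is only an illustration; the function name and the choice of three gene pairs are hypothetical.

```python
from itertools import product


def gametes(genotype):
    """List the kinds of gametes formed by a multiple heterozygote.

    `genotype` is a list of allele pairs, e.g. [("A", "a"), ("H", "h")].
    Each gamete receives exactly one member of every pair.
    """
    return ["".join(choice) for choice in product(*genotype)]


pairs = [("A", "a"), ("H", "h"), ("R", "r")]
kinds = gametes(pairs)
print(len(kinds), kinds)
```

Because one allele is drawn from each pair independently, the number of kinds doubles with every additional pair.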

The purpose of this paper is to put on record the ratios obtained in several
mammalian crosses made in genetic studies in the Laboratory of Animal Genetics,
College of Agriculture, University of Illinois, during a period of several years.

Daily inspection for birth of young was made and number of young recorded.
Usually two classifications were made, one at 12-18 days of age and the second
at the time of weaning, which in the case of the mice and rats was at about 30 days.
Rabbits were weaned at an age of about 8 weeks. Colours and colour patterns
can be easily classified as soon as the pelage is well formed. Dark eye and pink eye
are distinguishable at birth and were recorded when the birth records were made.
In hypotrichotic rats the hair is normal in appearance until between 16 and
20 days of age when depilation begins, first noticeable around the head.

In our records litters were classified as complete and incomplete. A complete
litter is one in which all the young were classified and an incomplete litter one
in which one or more had died before classification. Ratios among incomplete
litters did not differ significantly from ratios among complete litters, and for
this reason they are combined in the tables.

* Formerly Assistant in Animal Husbandry.
The symbols used and the characters which they designate are as follows:

Rats

A = agouti, a = non-agouti.

H = self colour (entirely coloured), h = hooded.

R = dark eye, r = red eye.

C = colour, c = albinism.

Hy = haired, hy = hypotrichosis (hairless).

D = intense colour, d = dilute colour.

Mice

A = agouti, a = non-agouti.

P1 = dark eye, p1 = pink eye.

P2 = dark eye, p2 = pink eye (a second gene causing pink eye).

B = black, b = brown.

Y = yellow, y = non-yellow (black or brown).


Rabbits

A = agouti, a = non-agouti.

E = extension, e = non-extension (yellow).

For obtaining probabilities of ratios with two classes the formula

$$\pm 0.6745\sqrt{npq},$$

in which $n$ is the number of individuals, $p$ the observed proportion of one class
and $q$ the proportion of the other class, was used. For ratios having three or more
terms, the fit of the observed to the theoretical ratio was determined by Pearson's
formula

$$\chi^2 = S\left\{\frac{(m' - m)^2}{m}\right\};$$

$m'$ is the observed number in a class, $m$ the theoretical or expected number, and
$S$ is the sum. $P$ is obtained from Pearson's Tables for Statisticians and Biometricians.
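
Both measures can be computed directly. The sketch below is an illustration only, with hypothetical function names; it takes the probable error as $0.6745\sqrt{npq}$ with the observed proportions and Pearson's $\chi^2$ as the sum of (observed - expected)$^2$/expected. The counts used are those read from the first rows of Table I (690 : 633) and Table IV (309 : 94 : 93 : 37).

```python
import math


def probable_error(n, p):
    """Probable error 0.6745 * sqrt(n p q) of the number in one class."""
    return 0.6745 * math.sqrt(n * p * (1.0 - p))


def chi_square(observed, ratio):
    """Pearson's chi-square of observed counts against a theoretical ratio."""
    total = sum(observed)
    expected = [total * r / sum(ratio) for r in ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Two classes, expected 1 : 1.
dominant, recessive = 690, 633
n = dominant + recessive
deviation = abs(dominant - n / 2)
pe = probable_error(n, dominant / n)
print(deviation, round(pe, 2), round(deviation / pe, 2))

# Four classes, expected 9 : 3 : 3 : 1.
print(round(chi_square([309, 94, 93, 37], [9, 3, 3, 1]), 3))
```

Small differences from the tabulated figures are to be expected, since the expected numbers in the tables are rounded to two decimals before squaring.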


1. Matings of the type Aa x aa. Theoretical ratio of 1 : 1.

    Aa -> A + a  }
                 }  gametes.
    aa -> a + a  }

    Aa + aa         zygotes.

TABLE I

Mating                 Dominant   Recessive   Deviation   Probable error   D/E

Rats
Aa x aa          O       690        633         28.5         12.26         2.32
                 E       661.5      661.5
Hh x hh          O       552        448         52.0         10.61         4.90
                 E       500        500
Rr x rr          O       418        401          8.5          9.65         0.88
                 E       409.5      409.5
Cc x cc          O       168        177          4.5          6.26         0.72
                 E       172.5      172.5
Hyhy x hyhy      O       745        720         12.5         12.90         0.97
                 E       732.5      732.5
Dd x dd          O       669        576         46.5         11.87         3.92
                 E       622.5      622.5

Mice
Aa x aa          O      1906       1828         39.0         20.61         1.89
                 E      1867       1867
P1p1 x p1p1      O      1898       1836         31.0         20.61         1.50
                 E      1867       1867
Bb x bb          O      1842       1892         25.0         20.61         1.21
                 E      1867       1867
P2p2 x p2p2      O       374        316         29.0          8.83         3.28
                 E       345        345
Yy x yy          O       259        270          5.5          7.76         0.71
                 E       264.5      264.5

Rabbits
Ee x ee          O        27         24          1.5          2.40         0.62
                 E        25.5       25.5
Aa x aa          O        40         32          4.0          2.84         1.41
                 E        36         36
2. Matings of the type Aa x Aa, producing two phenotypes in the ratio of
3 dominants to 1 recessive.

    Aa -> A + a  }
                 }  gametes.
    Aa -> A + a  }

    1 AA + 2 Aa + 1 aa   zygotes.

Since A is dominant a ratio of 3 : 1 is expected.


TABLE II

Mating                 Dominant   Recessive   Deviation   Probable error   D/E

Rats
Aa x Aa          O       607        185         13.00         8.03         1.62
                 E       594        198
Hh x Hh          O       622        227         14.75         8.70         1.70
                 E       636.75     212.25
Rr x Rr          O       389        128          1.25         6.62         0.19
                 E       387.75     129.25
Cc x Cc          O       946        324          6.50        10.47         0.62
                 E       952.5      317.5
Hyhy x Hyhy      O      1305        435          0.00        12.18         0.00
                 E      1305        435
Dd x Dd          O       605        204          1.75         8.33         0.21
                 E       606.75     202.25

Mice
P2p2 x P2p2      O       471        141         12.00         7.03         1.71
                 E       459        153

Rabbits
Ee x Ee          O       125         33          6.50         3.45         1.88
                 E       118.5       39.5
3. Matings of the type AaBb x aabb resulting in a theoretical ratio of
1 : 1 : 1 : 1.

TABLE III


MaliiiKH 


Mis 




I*' 




AHy i 

Ahy 

ally 

ahy 



AaHyhy X aahyhy 

0 

1H7 ; 

17.1 

lOil 

174 

Mini! 

»t:ii 


K 

ITfi-da 1 








HJIy i 

Hhy 

fitly 

hhy 



Hhllyhy X hhilyhy 

0 

till 

l«7 : 

Mfi 

I.lH 


*11*27 


E 

17|-2r. 

1 

: 



’ 



Rfty 

Rhy 

rlly ^ 

rhy 


: 

RrHyhy X rrhyhy 

0 

}-(li 1 

liVi 

M.l ; 

147 

It IHI'f 

14 S^iJi a 


E 

147 i 








; 

DHy 

Dhy 

dily 

dhy 



Ddllyhy X dllhylly 

0 

E 

ir.M 

1.7 hr, 

iim 

HI 

t.3H 

.3-241 


CcHyhy X cchyhy 


city 

Chy 

cHy 

chy 


; 

0 

III 

13 

in 

12 

M OdH 

»t •l.W* . 


E 

Els') 





1 

1 



At) 

Ad 

at) 

ad 


1 

AaDd xaudd 

0 

IHI 

KH 

1x7 

MO 

2 !tll7 

♦ t,3HS 


E 

I73'7fi 





1 



HD 

Hd 

hD 

hd 


i 

HhDd xhhdd 

0 

1«7 

1411 

1.33 

1.3.3 

12-11.37 



K 

irKI-.T 





i 



RD 

Rd 

rD 

rd 


j 

RrDd X rrdd 

0 

1H4 

HU) 

175 

101 

2-304 



E 

170 








AR 

Ai* 

aR 

ar 



AaRr x aarr 

0 

l.W 

140 

147 

143 

0-172 

mmi 


]■; 

14t|.fi 







HR 

Hr 

DR 

hr 



HhRr X hhrr 

0 

HI7 

150 

l.'M 

nil 

0543 

«» (Wl ' 


E 

I47'75 





AaHh xaahh 


AH 

Ah 

uH 

uh 


i 

j 

0 

17.'5 

1.30 

IIM 

147 

.'’*-72«i 

» 12ti i 


K 

155-5 





Mice 







i 

AaPiPj X aapjPi 

0 

E 

AP, 

(137 

938-6 

Ap, 

909 

aP, 

901 

ap, 

«U7 

0-911 

I 

0-075 1 

AaBb X aabb 

0 

E 

AB 

936 

933.5 

Ab 

971 

aB 

W 

ab 

921 

2-42g 

i| 

(Hm ^ 

BbPiPj X bbpiPj 

0 

E 

BPj 

942 

933-8 

BPy 

900 

bP, 

96(1 

pb, 

930 

1-820 

mm ] 


.-I 
4. Matings of the type AaBb x AaBb. Expected ratio of 9 : 3 : 3 : 1.


TABLE IV 


Mating 

Phenotypes 


P 

Rats 








AHy 

Ahy 

aHy 

ahy 



AaHyhy X AaHyhy 0 

3(W 

94 

93 

37 

1-526 

0-081 

E 

2()()'81 

i)!l4)4 

99-94 

33-31 




HHy 

Hhy 

hHy 

hhy 



HhHyhyXHhHyhy 0 

377 

135 

128 

34 

2-180 

0-539 

E 

379'12 

120-38 

120-38 

42-12 




RHy 

Rhy 

rHy 

rhy 



RrHyhy X RrHyhy 0 

00 

17 

20 

0 

1-044 

0-791 

E 

01'31 

20-44 

20-44 

0-81 




DHy 

Dhy 

dHy 

dhy 



DdHyhyXDdHyhy 0 

111 

42 

24 

14 

5-438 

0-146 

E 

1()7'40 

35-82 

35-82 

11-94 




CHy 

Chy 

CHy 

chy 



CcHyhy X CcHyhy 0 

427 

130 

140 

40 

0-232 

0-972 

E 

421'31 

140-44 

140-44 

40-81 




AD 

Ad 

aD 

ad 



AaDd X AaDd 0 

41 

10 

10 

5 

1-450 

0-697 

E 

45'5(1 

15-19 

15-19 

5-06 




RI) 

Rd 

rD 

rd 



RrDdxRrDd 0 

105 

52 

09 

18 

3-119 

0-376 

K 

ih7-k:) 

62-01 

02'(U 

20-87 
5. Matings of the type AaBbCc x aabbcc. Theoretical ratio of
1 : 1 : 1 : 1 : 1 : 1 : 1 : 1.


TABLE V


Mating 




Sals 






ADHy 

ADIly 

AdH, 

AaDdHyhy X aaddhyhy 1) 

72 

71 

70 

E 





IIDHy 

HDhy 

HdHy I 

HhDdHyhy X hhddhyhy tl 

HI 

86 

70 1 

E 



5 

i 


RDHy 

RDhy 

Rdtly 

RrDdHyhy X rrddhyhy 0 

117 

78 

60 

E 

liH'25 




HRHy 

HRhy 

HrHy 

HhRrHyhyxhhrrhyhy 0 

71) 

80 

HI 

E 

7l'a25 




AHHy 

AHhy 

AhHy 

AaHhHyhy X aahhhyhy 0 

HH 

77 

(i2 

E 

7'2'2r> 




AHR 

AHr 

AhR 

AaHhRr x aahhrr 0 

H3 

83 

67 

E 

73-25 




AHD 

AHd 

AhD 

AaHhDdxaahhdd 0 

HI 

76 

66 

E 

7()'125 




ARHy 

ARhy 

ArHy 

AaRrHyhy x aarrhyhy 0 

(it) 

77 

7!) 

E 

71'625 




ARD 

ARd 

ArD 

AaRrDd x aarrdd 0 

68 

60 

78 

E 

68'875 




HRD 

HRd 

HrD 

HhRrDdxhhrrdd 0 

88 

63 

78 

E 

ea'25 



Mice 





ADB 

ADb 

AdB 

AaDdBbxaaddbb 0 

476 

462 

460 

E 

466'76 




I’liftiDlyiK'* 


I* ^ P 


Adhy 

aWty 

alJhy 

adify 

adhy 



62 

7(1 

MM 

tl 


m 

.1714 

^ (1 Hirt 

lldh, 

hI)H, 

hl))ly 

bdltf 

htlhj 



IB 

61 

6.3 


63 

O-Kpt 

0'2«)l : 

Rdh, 

rDIty 

rlJhy 

fdll, 

rdhf 


; 

60 

7.7 

7(1 

;"t!( 

m 

4 .3H0 

((73.1: 

Hrhy 

hRII, 

hRhy 

hrBy 

hrhy 


1 

76 

64 

67 

OK 

66 

6 071 

0 4M 

Ahhy 

aHHy 

alihy i abHy 

ahhy 



67 

73 


(H 

«7 

7 742 

1 

i 

Ahr 

aHR 

allr 

ahR 

ahf 


1 

63 

HI 

7.5 

66 

W 

0 .*.10 

((■482 1 

! 

Ahd 

aHD 

ttUd 

ahl) 

ahd 


1 

68 

89 

83 

82 

6H 

il'K 

(•■1*27 1 

.) 

Arhy 

aRHy 

«Rhy 

arBy 

arhf 


1 

65 

74 

70 

82 

(7 


IHI3 

Afd 

aRD 

aRd 

»rB 

art 



64 

79 

62 

(« 

64 

I'HIfl 


Hrd 

hRD 

hRd 

brl) 

hrd 



72 

57 

66 

17 

fS 

12212 

It (#4 1 

Adb 

aDB 

«l)b 

adB 

adb 


I 

508 

467 

414 

440 

4*27 

lo-ata 

J 



6. Matings of the type AaBbCcDd x aabbccdd. Theoretical ratio of sixteen classes in equal numbers.

TABLE VI (Rats)
7. Matings of the type AaBbCcDdEe x aabbccddee. Theoretical ratio of
thirty-two classes in equal numbers.


TABLE VII (Rats)


AaHhRrDflllyhy x aahhrrtklhyhj 

Plu'iiotyiH'a : 

(1 

K 

; 

AHRDH, 1 

17 

mm ^ 

AHRDhy 

23 


AHRliHy 

20 

! 

AIIrDH, 

24 


AhRDHy 

12 


AHRdhy 

Ifi 


AHrdHy 

23 


AHrDhy 

17 


AhrDHy 

20 


AhRDhy 

1(1 


AhRdHy 

1(1 

I 

AHrdhy 

1(1 


AhrdHy 

(1 

.1 

1 

AhRdhy 

14 

j 

AhrDhy 

13 


Ahrdhy 

IH 

i 

'1 

aHRDHy 

23 

;! 

aHRDhy 

24 

{ 

aHRdHy 

14 

' 

allrDHy 

Ifi 

i! 

ahRDHy 

13 

>1 

aHRdhy 

15 


aHrdHy 

14 


aHrDhy 

22 


ahrDHy 

15 


ahRDhy 

15 


ahRdHy 

17 

i ^ 

aHrdhy 

10 


ahrdHy 

13 


ahrDhy 

17 


ahRdhy 

Ifi 

i 1; 

ahrdhy 

17 

j 


25 m 

P 


IN® 

L 
8. When albinism is involved, as in a mating of heterozygous coloured self
(CcHh x CcHh), any animal which is albino (cc) will not show self or hooded
though genetically present. The same is true for agouti and non-agouti. The
theoretical ratio is 9 coloured self : 3 coloured hooded : 4 albino.
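
The way albinism collapses the usual classes can be checked by enumeration. The sketch below is an illustration only: the function and parameter names are hypothetical, each gene is assumed to segregate independently with three of the four equally likely combinations showing the dominant character, and an albino is assumed to hide every other character except those named in `unmasked` (such as hair).

```python
from itertools import product
from collections import Counter


def phenotype_ratio(genes, masking_gene="C", unmasked=()):
    """Count phenotypes among offspring of two multiple heterozygotes.

    `genes` lists the dominant symbols, e.g. ["C", "H"]; the recessive
    character is written in lower case.
    """
    counts = Counter()
    # Per gene: 4 equally likely zygotes, 3 showing the dominant character.
    for combo in product([True, True, True, False], repeat=len(genes)):
        shown = dict(zip(genes, combo))
        albino = masking_gene in shown and not shown[masking_gene]
        label = ""
        for gene, dominant in shown.items():
            if albino and gene != masking_gene and gene not in unmasked:
                continue  # character hidden in albinos
            label += gene if dominant else gene.lower()
        counts[label] += 1
    return dict(counts)


print(phenotype_ratio(["C", "H"]))
print(phenotype_ratio(["C", "A", "H"]))
print(phenotype_ratio(["C", "A", "Hy"], unmasked=("Hy",)))
```

The same enumeration extends to any number of independent genes, so long as it is stated which characters remain visible in albinos.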


TABLE VIII (Rats)


Mating 

Phenotypes 

a:® 

F 



GH 

Ch 

c 



CcHhxCcHh 

0 

368 

127 

162 




B 


121-31 

181-75 





CA 

Ca 

c 



CcAa X CcAa 

0 

302 

123 

102 

Q-034 



E 

303-94 

121-31 

161*75 




9. Matings of rats heterozygous for colour, agouti, and self (CcAaHh x
CcAaHh) will give a theoretical ratio of 27 coloured, agouti, self : 9 coloured,
agouti, hooded : 9 coloured, non-agouti, self : 3 coloured, non-agouti, hooded : 16
albino.

TABLE IX

Mating                       Phenotypes                                $\chi^2$    P

                             CAH      CAh      CaH      Cah      c
CcAaHh x CcAaHh      O       278      95       79       39       161      4.263    0.376
                     E       275.06   91.69    91.69    30.56    163




10. The theoretical ratio among progeny from matings of rats heterozygous
for colour, agouti, and hair (CcAaHyhy x CcAaHyhy) or heterozygous for colour,
self and hair (CcHhHyhy x CcHhHyhy) is 27 : 9 : 9 : 3 : 12 : 4.


TABLE X


Mating 

PhenotyjKJB 


P 



GAHy 

CAhy 

CaHy 

Gahy 

CHy 

Chy 



CcAaHyhy X CcAaHyhy 

0 

262 

78 


29 

104 

38 

0-867 

KBiHlI 


E 


81-60 



108-76 

36-26 





GHHy 

GHhy 

GhHy 

Chhy 

cHy 

chy 



GcHhHyhyxGcHhHyhy 

0 

263 

93 


26 

114 

43 


0-710 


E 

270-42 




120-19 
11. A mating of heterozygous coloured, agouti, self, haired rats
(CcAaHhHyhy x CcAaHhHyhy) would give a theoretical ratio of 81 coloured,
agouti, self, haired : 27 coloured, agouti, self, hairless : 27 coloured, non-agouti,
self, haired : 27 coloured, agouti, hooded, haired : 9 coloured, agouti, hooded,
hairless : 9 coloured, non-agouti, self, hairless : 9 coloured, non-agouti, hooded,
haired : 3 coloured, non-agouti, hooded, hairless : 48 albino, haired : 16 albino,
hairless.

TABLE XI

CcAaHhHyhy x CcAaHhHyhy

K 

wa-fi 
01-2 
{ii-2 
fil-2 
2U-I 

IVH 
irtfi-K 
3(1’2 

7'IU2 
tr674 

Among the sixty-five ratios given in Tables I-XI, five depart significantly
from the theoretical expectation. D/E for the ratio obtained from Hh x hh is
4.90, for Dd x dd, 3.92 and for P2p2 x p2p2, 3.28 (Table I). For HhHyhy x
hhhyhy, P is 0.027, and for HhDd x hhdd, P is 0.005 (Table III). Four of these
five ratios involve the genes for hooded and dilution. Wherever these genes appear
in backcrosses, a deficiency in the phenotypes showing these genes always occurs.

Among thirteen crosses giving a theoretical ratio of 1 : 1, ten have the recessive
class smaller than the expected, but only three significantly so. The sum of all
these monohybrid backcrosses is 9588 dominants : 9153 recessives, D/E = 4.71.
For the crosses of the type Aa x Aa, which give a theoretical ratio of 3 : 1, a
total of 5070 dominants : 1677 recessives was obtained. In this case D/E = 0.4067,
which is a very close fit.

For the ratios with four classes expected in equal number, P »s ii-bUTR, and 
for the crosses giving a theoretical ratio of 9 : 3 ; 3 ; 1, P * For ratiog 

with eight equal classes, P = 0.0014, and for sixteen classes, P = 0.0278.

When the differences between the observed and theoretical classes are all
(or nearly all) in the same direction, the larger the number of such divergent
cases which are added together the greater will be the significance of the departure
from the theoretical expectation.


Plionofcyp('.a 

0 

CAHHj 

181t 

CAHh, 

(W 

CAhH, 

03 

CaHHy 

m 

CAhhy 

18 

CaHhy 

24 

CahHy 

211 

Cahhy 

5 

CHy 

104 

Chy 

38 
NOTE ON THE PRECEDING ANALYSIS OF
MENDELIAN SEGREGATIONS

By J. B. S. HALDANE, F.R.S.


The data summarized in Table VII are, I think, unique. However the authors'
analysis of them can, I believe, be slightly improved. There are thirty-two
classes whose expectation is equal, giving thirty-one degrees of freedom. Now
each of these can be specified, and the appropriate value of $\chi^2$ calculated. This
is done in my Table I. The first five degrees of freedom are the segregations for
single pairs of genes. Thus there were 277 A and 274 a rats. Hence

$$\chi^2 = \frac{3^2}{551} = 0.0163.$$

The next ten correspond to potential linkages. Thus the degree denoted as AH
is due to the dichotomy of the total into 281 AH and ah rats, and 270 Ah
and aH, giving $\chi^2 = 0.2196$. Ten more degrees are obtained by considering
the associations of gene pairs three at a time. Thus there were
270 (AHR + Ahr + aHr + ahR) rats and 281 (aHR + AhR + AHr + ahr),
giving $\chi^2 = 0.2196$. This degree is denoted by AHR. Five degrees are obtained
by considering four pairs at a time. Thus the degree AHRD is given by
274 (AHRD + AHrd + AhRd + AhrD + aHRd + aHrD + ahRD + ahrd) and
277 (aHRD + AhRD + AHrD + AHRd + Ahrd + aHrd + ahRd + ahrD) rats,
giving $\chi^2 = 0.0163$. Finally the degree AHRDHy is given by

278 (AHRDHy + AHRdhy + AHrDhy + ... + Ahrdhy + ...)
and 273 (aHRDHy + ... + ahrDHy + ... + ahrdhy), giving $\chi^2 = 0.0454$.

In each case the principle is the same. We consider the genes 2, 3, 4 or 5 at a
time, and divide the rats into two groups, one containing an even number of
dominant genes, the other an odd.
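
The partition into single degrees of freedom can be sketched as follows. This is an illustration under stated assumptions, not a transcription of the calculation in the text: every class is coded by +1 (dominant) or -1 (recessive) for each gene, the difference for a set of genes is the signed total of the class numbers, and each $\chi^2$ is that difference squared over the grand total. The two-gene counts are hypothetical, and the function name is an assumption.

```python
from itertools import combinations
from math import prod


def chi_square_components(counts, genes):
    """Split chi-square for 2**n equally expected classes into single degrees.

    `counts` maps a tuple of +1 (dominant) / -1 (recessive), one entry per
    gene, to the number of animals in that class.
    """
    total = sum(counts.values())
    components = {}
    for size in range(1, len(genes) + 1):
        for subset in combinations(range(len(genes)), size):
            # Classes with an even number of recessives among the chosen
            # genes count positively, the others negatively.
            diff = sum(num * prod(cls[i] for i in subset)
                       for cls, num in counts.items())
            name = "".join(genes[i] for i in subset)
            components[name] = (diff, diff * diff / total)
    return components


# Hypothetical two-gene backcross, for illustration only.
example = {(1, 1): 30, (1, -1): 20, (-1, 1): 25, (-1, -1): 25}
parts = chi_square_components(example, ["A", "B"])
print(parts)

# The degree A of the text: 277 A against 274 a rats in a total of 551.
print(round(3 ** 2 / 551, 4))
```

In this example the components add up to the ordinary $\chi^2$ for four equally expected classes, which is the property that makes the partition useful.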

Biologically the first five degrees of freedom represent differences of viability
between dominants and recessives.

The next ten could represent linkages. However, as there is good reason to
think that the five genes concerned are in different chromosomes, deviations
greater than could be accounted for by sampling would probably be due to the
fact that the effects of the different genes on viability were not additive. For
example, Roberts, Dawson and Madden's Table I leaves little doubt that h and
d have an adverse effect on viability, presumably before birth. We further notice
that the degree of freedom HD has a large, though not significant, $\chi^2$, due to an
excess of HD and hd rats. The actual numbers are 168 HD, 137 Hd, 125 hD,
121 hd. If the viabilities were as $1 : 1-\alpha : 1-\beta : 1-\alpha-\beta$ we should expect, on
the basis of the single-factor segregations, to find 161.25 HD, 143.75 Hd,
TABLE I

Analysis of $\chi^2$, Roberts et al., Table VII

Degree of freedom   Difference    $\chi$       $\chi^2$     P

A                      + 3       +0.1278     0.0163     0.90
H                      +59       +2.5135     6.3176     0.012
R                      + 5       +0.2130     0.0454     0.83
D                      +35       +1.4911     2.2232     0.14
Hy                     - 5       -0.2130     0.0454     0.83
AH                     +11       +0.4686     0.2196     0.64
AR                     -11       -0.4686     0.2196     0.64
AD                     -13       -0.5538     0.3067     0.58
AHy                    +27       +1.1502     1.3158     0.25
HR                     + 1       +0.0426     0.0018     0.97
HD                     +27       +1.1502     1.3158     0.25
HHy                    + 7       +0.2982     0.0889     0.77
RD                     - 3       -0.1278     0.0163     0.90
RHy                    + 5       +0.2130     0.0454     0.83
DHy                    - 5       -0.2130     0.0454     0.83
AHR                    -11       -0.4686     0.2196     0.64
AHD                    -25       -1.0650     1.1343     0.29
AHHy                   +19       +0.8094     0.6552     0.42
ARD                    -23       -0.9798     0.9419     0.33
ARHy                   -31       -1.3206     1.7459     0.19
ADHy                   - 9       -0.3834     0.1470     0.70
HRD                    +37       +1.5762     2.4664     0.12
HRHy                   -15       -0.6390     0.4083     0.53
HDHy                   -13       -0.5538     0.3067     0.58
RDHy                   -31       -1.3206     1.7459     0.19
AHRD                   - 3       -0.1278     0.0163     0.90
ARDHy                  -31       -1.3206     1.7459     0.19
AHDHy                  -13       -0.5538     0.3067     0.58
AHRHy                  -23       -0.9798     0.9419     0.33
HRDHy                  +17       +0.7242     0.5245     0.47
AHRDHy                 + 5       +0.2130     0.0454     0.83

Total (31)             - 7       -0.2130    25.639      0.76


131‘76 hD, 1 14-25 hd, which would give ™ *• foi* degree of fwithuu HD. 
If the viabilities were as $1 : 1-\alpha : 1-\beta : (1-\alpha)(1-\beta)$, which is perhaps a more
plausible hypothesis, we should expect 162.19 HD, 143.81 Hd, 130.81 hD,
114-19 hd, giving x^ = 0-001, So the two hypotheses are indwliugiiwhuble 
except in enormous samples. 

The remaining degrees of freedom represent similar biological possibilities
as to the non-additive character of the effects of various genes on differential
viability. Fisher (1926), who was the first to point out the method here employed
for the analysis of $\chi^2$, states that these degrees have "no simple biological meaning".
However, the positive value for the degree HRD could be mainly due, for
example, to the fact that hRD rats, of which there were only 58 as against an
expectation of 68.875, are more inviable than was to be expected from the effects
of the genes one at a time, which would give an expectation of 66. Such interactions
have, of course, been observed by Gonsalez (1923), Timofeeff-Ressovsky
(1934) and others.

In my Table I the values of P in the fifth column are read off from those of χ
in the third by means of a table of the probability integral, except the last entry,
which is calculated by Wilson and Hilferty’s (1931) theorem. There are no
suspiciously high values of P, and only one (for the degree H) which is significantly
low. Actually a glance at Roberts, Dawson and Madden’s Table I shows that had
their Table VII been based on a larger sample the values for H and D would
almost certainly have been significantly low.
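
The arithmetic behind the columns can be sketched as follows. The tabulated χ values agree with χ = d/√N, where d is the Difference and N = 551; this value of N is an assumption inferred from the printed ratios (for instance 59/2.5134), not a figure stated in the text. P is taken here as the two-sided tail of the normal probability integral.

```python
import math

N_ASSUMED = 551  # assumption: inferred from the tabulated ratios, not stated in the text


def chi_from_difference(d, n=N_ASSUMED):
    """Normal deviate for a difference d between two equally expected halves of n."""
    return d / math.sqrt(n)


def two_sided_p(chi):
    """Two-sided tail probability of a unit normal deviate."""
    return math.erfc(abs(chi) / math.sqrt(2))


# A few degrees of freedom from the table, with their Differences.
for name, d in [("A", 3), ("H", 59), ("D", 35), ("HRD", 37)]:
    chi = chi_from_difference(d)
    print(f"{name:4s} d={d:+3d} chi={chi:+.4f} chi2={chi * chi:.4f} P={two_sided_p(chi):.3f}")
```

Small disagreements in the last decimal place are to be expected, since the printed values appear to have been rounded at an intermediate step.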

In the earlier tables some rats are included which do not figure in Tables VII
and XI. Hence for a full discussion of the data each table would have to be
analysed separately.

We can now criticize Roberts, Dawson and Madden’s final analysis of the
data. They point out that the total χ² for segregations involving eight equal
classes, i.e. three genes at a time, is unduly high. We can see why this is so.
Supposing we extracted from Table VII the data on segregations of three genes
at a time we should obtain ten tables with seventy degrees of freedom. If these
were analysed into their components we should find that we had counted the ten
degrees of which AHR is typical once each, those of which AH is typical three
times each, and those of which A is typical (i.e. single gene-pair segregations)
six times each. We have thus given a quite undue weight to those degrees of
freedom which actually show the greatest deviations. Actually each
determination of χ² by the authors involves one degree of freedom not dealt with in the
earlier tables, while the remainder refer to segregations already considered, and
often on larger samples, in their earlier tables.
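
The counting in this argument can be checked by enumeration. The sketch below lists the ten three-gene tables that can be drawn from the five gene pairs A, H, R, D and Hy, splits each into its seven component degrees of freedom, and tallies how often each degree is counted; the helper names are illustrative only.

```python
from collections import Counter
from itertools import combinations

GENES = ["A", "H", "R", "D", "Hy"]


def component_degrees(gene_set):
    """Every non-empty sub-combination of a gene set is one degree of freedom."""
    return [combo
            for size in range(1, len(gene_set) + 1)
            for combo in combinations(gene_set, size)]


# Tally how many of the ten three-gene tables contain each degree.
times_counted = Counter()
for triple in combinations(GENES, 3):
    for degree in component_degrees(triple):
        times_counted[degree] += 1

total_degrees = sum(times_counted.values())

# Group as (number of genes in the degree, times counted) -> how many such degrees.
summary = Counter((len(degree), times) for degree, times in times_counted.items())

print(total_degrees)
print(sorted(summary.items()))
```

The tally reproduces the seventy degrees of the text: ten three-gene degrees counted once, ten two-gene degrees counted three times, and five single-gene degrees counted six times.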

Thus, to take an example, the rat mating AaHhDd × aahhdd of Table V
has a χ² of 11.31 for seven degrees of freedom. But the only new information in
Table V relates to the difference between AHD + Ahd + aHd + ahD and
aHD + AhD + AHd + ahd, giving χ² = 1.941 for one degree of freedom. All the
other information is already given in Table III, along with some more, for
example, concerning rats segregating for A and D, but not for H. The correct
method of collating the data is therefore to add up the values of χ² for the last
degree of freedom in each case.

The expression for the last degree of freedom in an F₂ mating such as
AaDd × AaDd can easily be shown to be … . This
expression has … = 1, and a nearly normal distribution. Similar expressions
can be written down for larger F₂’s. Where the mating involves epistacy the
formulae for the last degree of freedom are similar. For example, the first mating
of Table VIII involves two degrees of freedom, the degree C for the segregation
of C and c rats and the degree H(C) for the segregation of H and h among C rats.
TABLE II

Values of χ² for single degrees of freedom

Animal    Degree of freedom    χ²
Rat       A                    2.4…
Rat       H                    10.…
Rat       R                    …
Rat       C                    …
Rat       Hy                   …
Rat       D                    …
Mouse     A                    1.…
Mouse     P₁                   …
Mouse     B                    …
Mouse     P₂                   …
Mouse     Y                    0.032
Rabbit    E                    0.177
Rabbit    A                    …
Rat       AA                   1.13…
Rat       HH                   1.3…
Rat       RR                   …
Rat       CC                   0.177
Rat       HyHy                 …
Rat       DD                   0.02…
Mouse     P₂P₁                 1.255
Rabbit    EE                   1.420
1 factor  21                   31.912

Rat       AHy                  0.629
Rat       HHy                  0.027
Rat       RHy                  0.061
Rat       DHy                  0.323
Rat       CHy                  0.296
Rat       AD                   0.175
Rat       HD                   2.032
Rat       RD                   1.471
Rat       AR                   0.001
Rat       HR                   0.387
Rat       AH                   0.778
Mouse     AP₁                  4.262
Mouse     AB                   0.130
Mouse     BP₁                  0.130
Rat       A₂Hy₂                1.368
Rat       H₂Hy₂                1.852
Rat       R₂Hy₂                0.081
Rat       D₂Hy₂                0.88…
Rat       C₂Hy₂                0.025
Rat       A₂D₂                 0.657
Rat       R₂D₂                 0.012
2 factors 21                   15.572
This would be affected by linkage. Its χ² is

$$\chi^2 = \frac{4(\mathrm{CH} - 3\,\mathrm{Ch})^2}{9(\mathrm{CH} + \mathrm{Ch} + \mathrm{c})}.$$

Similarly, for Table IX the last degree of freedom involving the association of A and H among C
animals is

$$\chi^2 = \frac{4(\mathrm{CAH} - 3\,\mathrm{CAh} - 3\,\mathrm{CaH} + 9\,\mathrm{Cah})^2}{27(\mathrm{CAH} + \mathrm{CAh} + \mathrm{CaH} + \mathrm{Cah} + \mathrm{c})}.$$
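
The divisors in these two expressions can be verified from the Mendelian expectations. The sketch below computes the mean and variance, per animal, of the weighted class count in each numerator; it assumes the usual 9 : 3 : 4 expectation for the classes CH, Ch and c. The same weights (1, -3, -3, 9) applied to a 9 : 3 : 3 : 1 segregation are included as an inference about the two-factor F₂ case, not as a formula taken from the text, and the counts passed to the last call are hypothetical.

```python
from fractions import Fraction


def sixteenths(*parts):
    """Class probabilities given as numerators over 16."""
    return [Fraction(p, 16) for p in parts]


def contrast_moments(weights, probs):
    """Mean and variance, per individual, of a weighted class count."""
    mean = sum(w * p for w, p in zip(weights, probs))
    variance = sum(w * w * p for w, p in zip(weights, probs)) - mean * mean
    return mean, variance


def chi2_h_among_c(CH, Ch, c):
    """4(CH - 3Ch)^2 / 9(CH + Ch + c): the degree H(C)."""
    return Fraction(4 * (CH - 3 * Ch) ** 2, 9 * (CH + Ch + c))


def chi2_ah_among_c(CAH, CAh, CaH, Cah, c):
    """4(CAH - 3CAh - 3CaH + 9Cah)^2 / 27(total): A with H among C animals."""
    contrast = CAH - 3 * CAh - 3 * CaH + 9 * Cah
    return Fraction(4 * contrast ** 2, 27 * (CAH + CAh + CaH + Cah + c))


# Classes CH, Ch, c expected as 9 : 3 : 4; the c class carries weight 0.
print(contrast_moments([1, -3, 0], sixteenths(9, 3, 4)))
# Classes AD, Ad, aD, ad expected as 9 : 3 : 3 : 1 (assumed weights).
print(contrast_moments([1, -3, -3, 9], sixteenths(9, 3, 3, 1)))
# Hypothetical counts, chosen only for illustration.
print(float(chi2_h_among_c(95, 25, 40)))
```

A variance of 9/4 per animal is what turns the square of CH - 3Ch into χ² on division by 9N/4, which is the printed form 4(CH - 3Ch)²/9N.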

The last component degrees of the sixty-five segregations of Tables I-XI
are given in my Table II. Whereas Roberts, Dawson and Madden, for the reason
given above, find a significantly large total χ², I do not. The probability of
χ² = 75.44 for sixty-five degrees of freedom is P = 0.18. The whole excess is due
to the single-factor segregations, and of these, one gives a significantly large χ²,
while another probably does so. Of the forty-four degrees of freedom involving
more than one factor the largest is 4.252, and one value as large as this is to be
expected. The order of the degrees is taken from Roberts, Dawson and Madden.
If we include the eight cases involving epistacy with the others involving the
same number of factors (gene pairs) we find the results summarized in my
Table III.
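
The probability quoted for the total can be approximated with the Wilson and Hilferty cube-root transformation, which treats the cube root of χ²/n as nearly normal. The sketch below is a check on the figure P = 0.18 under that approximation; the function name is illustrative.

```python
import math


def wilson_hilferty_p(chi2, df):
    """Approximate upper-tail probability of chi-square on df degrees of freedom,
    using the Wilson-Hilferty cube-root normal approximation."""
    k = 2.0 / (9.0 * df)
    deviate = ((chi2 / df) ** (1.0 / 3.0) - (1.0 - k)) / math.sqrt(k)
    return 0.5 * math.erfc(deviate / math.sqrt(2.0))


print(round(wilson_hilferty_p(75.44, 65), 2))
```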

TABLE III

Summary of χ² values

Number of factors    Number of degrees    Total χ²
segregating          of freedom
1                    21                   31.912
2                    23                   16.942
3                    14                   20.636
4                     6                    5.913
5                     1                    0.045
Total                65                   75.44

It will be seen that, except for the single-factor ratios, there is no evidence of
deviation from Mendelian expectations, and on the whole the fit is very much
better than would appear from the analysis of Roberts, Dawson and Madden.
THE USE OF STATISTICAL METHODS IN THE INVESTIGATION
OF PROBLEMS OF CLASSIFICATION IN
ANTHROPOLOGY


PART I. THE GENERAL NATURE OF THE MATERIAL AND THE FORM
OF INTRARACIAL DISTRIBUTIONS OF METRICAL CHARACTERS


By G. M. MORANT


Every naturalist who has had the misfortune to undertake the description of a group of
highly varying organisms, has encountered cases (I speak after experience) precisely like
that of man; and if of a cautious disposition, he will end by uniting all the forms which
graduate into each other, under a single species; for he will say to himself that he has no
right to give names to objects which he cannot define.

CHARLES DARWIN, The Descent of Man, 1871.


1. INTRODUCTION

The main aim of physical anthropology is to unravel the course of human
evolution, and it may be taken for granted to-day that the proper study of the
natural history of man is concerned essentially with the mode and the path of
his descent. The general method employed is to compare the physical characters
of suitably chosen groups of individuals, and to discover the interrelationships
of these groups from an interpretation of the differences found between them.
Difficulties are encountered at the outset owing to the nature of the continuum
of which the component parts have to be compared. The groups which are most
suitable for the purpose in view can easily be recognized in a general way, but it
is difficult to define them with precision. There is a lamentable lack of agreement
among anthropologists to-day regarding the way in which populations suitable
for the purpose of investigating racial descent can best be discriminated. Darwin
stressed the difficulty of the problem, and it would be idle to hope that any
simple solution of it has been overlooked.

This paper provides a discussion of the statistical approach to anthropological
taxonomy. The general thesis maintained in it is that the comparatively
new method can be used in a systematic way to discriminate suitable groups, to
reveal their interrelationships, and hence to disclose the course of racial history
in so far as adequate evidence is available.

It is now more than forty years since Karl Pearson first applied to racial
material the statistical methods which he established and greatly extended.*
Recognition of the value of his procedure was slowly won, but the methods are
almost universally accepted to-day. It may even be said now that the advantages
of their use in physical anthropology are generally taken for granted. There is no
longer a need to emphasize the value of measurements, or to point out repeatedly
the futility of attempting to derive any useful conclusions from evidence as
scanty as that on which many earlier theories were based.

Karl Pearson’s teaching in this field has been most widely accepted in so far
as methods of reducing group data are concerned, but it is one thing to describe
and another to interpret in anthropological terms the situation observed. With
regard to such interpretation, he indicated the lines along which he expected that
progress would be most profitable, but he did not codify a system or lay down
any rigid rules for the guidance of those who wished to follow him. His ideas
with regard to this matter could only be grasped by observation of the ways in
which he treated particular sets of material. Meanwhile, different anthropologists
were drawing deductions from statistical evidence in a variety of ways,
some of which are entirely at variance with what came to be known as
“biometric” practice, and many of which are irreconcilable inter se. The position
became chaotic, and it is still in this state.

The view advocated in this paper is that the ways in which statistical
methods may be used to supply valid and useful anthropological conclusions can
only be determined from wide application of these methods to suitable material.
No a priori considerations are likely to be of much help here; the nature of the
situation has to be examined thoroughly before it is possible to decide on the
best ways of treating it to give results of the kind required. The short history of
the subject bears clear witness to the fact that the empirical test is always the
crucial one. The nature of anthropological groups has now been sufficiently
explored, and certain methods of interpretation have now been sufficiently
applied, to make possible a just assessment of the value of a particular procedure
due primarily to Karl Pearson. He repeatedly stressed the need for adapting
statistical theory to practice, and we may follow him in deciding that the value
of any new descriptive methods, or modes of interpretation, suggested must be
judged from a sufficiently wide application of them.

The statistical nature of anthropological groups will be discussed first, and
certain simple but important generalizations which are often ignored by
anthropologists can be formulated at this stage. In the second place, different
methods of reduction will be considered, and, lastly, the anthropological
conclusions which may legitimately be derived from the data reduced in that way
will be discussed. In this Part I of the whole paper topics discussed concern the
selection of anthropological samples (§ 2), general considerations regarding the
treatment of the samples chosen (§ 3), and the forms of intraracial distributions
(§ 4).

* The first general treatment of the topic was given in a paper by Cicely D. Fawcett and others
(1902) which Karl Pearson edited and arranged. He had previously applied statistical methods to
anthropological material in The Chances of Death and other Studies in Evolution (1897) and in a
series of papers in the Proceedings and Transactions of the Royal Society of London. Quetelet
(chiefly in 1836 and 1871), Galton (in various publications), Stieda (1883) and Witt (1879) had
previously used statistical methods in discussing anthropological problems, but they were either
not concerned with problems of racial differentiation or else they considered such problems only
in a cursory way.

2. THE SELECTION OF ANTHROPOLOGICAL SAMPLES

Anthropology has been defined as the study of groups, and this definition is
appreciated at once by the statistician, since his methods are essentially designed
for the treatment of group data. In a consideration of anthropological material
the nomenclature of the statistician may be used with advantage; the nature of
the processes of description and analysis is thereby made clearer and certain
ambiguous biological terms are avoided.

A population is defined to be any assemblage of individuals considered and
treated as a single group. It may be large or small, and there may be little or
much justification for treating it as a single group. The term is a general, but
unambiguous, one which can be conveniently used in practice. A sample is
made up by a number of individuals selected from a population, and the selection
is said to be random when it is believed that the population as a whole is fairly
represented, so that there is no bias favouring any special section of it. A sample
of individuals may be said to form a series, and these two terms can often be
interchanged without loss of perspicuity.

The general method of the physical anthropologist in dealing with new material
is: (a) to select a sample at random from a particular population of a suitable
kind, (b) to describe the characters of the individuals comprising the sample,
(c) to infer from these observations, with greater or less accuracy according to
the size of the sample, certain characteristics of the population sampled, (d) to
make comparisons between the evidence so obtained and that available for
other populations described in the same way, and (e) to deduce from these
comparisons the biological relationships of the new population. It is necessary
to make clear distinctions between these successive processes, and failure to do
so has sometimes led to confusion.
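
Step (c), inference with greater or less accuracy according to the size of the sample, can be illustrated by the standard error of a mean, which falls roughly as the square root of the number of individuals. The sketch below draws random samples from a simulated population; the population, its mean of 180 and its standard deviation of 6 are hypothetical figures chosen only for illustration.

```python
import random
import statistics


def standard_error(sample):
    """Estimated standard error of the sample mean."""
    return statistics.stdev(sample) / len(sample) ** 0.5


rng = random.Random(0)
# Hypothetical population of 20,000 measurements (say, in millimetres).
population = [rng.gauss(180.0, 6.0) for _ in range(20000)]

for n in (25, 100, 400):
    sample = rng.sample(population, n)  # step (a): random selection
    mean = statistics.mean(sample)      # step (b): description
    se = standard_error(sample)         # step (c): accuracy of the inference
    print(f"n={n:4d} mean={mean:7.2f} standard error={se:5.2f}")
```

Quadrupling the sample should roughly halve the standard error, which is the sense in which larger samples permit more accurate inference about the population sampled.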

The description in statistical terms of the first process, i.e. that of sampling,
is perfectly precise providing that there is some means of ensuring that the
samples are taken at random, but in applying the process to his material the
anthropologist is faced with another difficulty. How is he to distinguish the
populations from which suitable samples may be taken? In general very little
consideration has been given to this question. In comparisons made between
samples selected in different ways and representing different kinds of populations,
differences between the kinds of sources from which they were derived have
generally been ignored. But in fact such differences are of vital importance, and
half the difficulty of the whole process of the statistical analysis of anthropological
material is overcome if an effective way of dealing with the problem of
selecting suitable samples can be devised. The contention made here is that
owing to diversity in the modes of selection of the samples commonly used by
anthropologists, there is often little justification for applying statistical methods
to them in a rigid way.

For the anthropologist the ideal population would be one made up by a
number of individuals having a common descent. In the most favourable
circumstances such communities cannot be distinguished at all exactly, however,
and in the initial stage of his enquiry the anthropologist is generally quite unable
to delimit with any approach to precision groups defined in such an abstract
way. In practice far less stringent conditions have to be accepted. It may be
said that a community made up by a number of individuals whose ancestors (or
the majority of them, at least) are believed to have intermarried for a considerable
number of generations will be a suitable one to accept as a unit group from
which a sample may be taken. Such groups have to be chosen as carefully as
possible, having regard to any relevant evidence available. But populations of
very different sizes will satisfy the condition stated. The total population of a
province and also a small parochial community forming a special part of this
total might be considered, and it is possible that these are different in nature and
that different conclusions would be derived from them. A further condition,
then, which should be satisfied wherever possible, is that the population
considered is a large one of the regional rather than of the parochial kind. Experience,
discussed below, has shown that large communities are always more suitable for
consideration than small ones when the purpose in view is the analysis of more
remote origins. But if the only sample which can be obtained does represent a
population of the parochial kind (as may be the case if a series of skeletons from
a single cemetery is the only material available) then it should be fully recognized
that this may have been a special part of a larger population which might more
profitably be considered as a unit group, and allowance should be made for the
peculiarity of the source.

Having selected a number of samples from what appear to be suitable
populations, the anthropologist may compare their statistical features, such as
the forms of distributions, measures of variation and correlations they supply.
If the majority of the samples are found to possess certain particular
characteristics, while departures from the rule are occasionally met, then it may be
legitimate to conclude that the peculiarity of the exceptional cases is due to the
fact that the populations they represent are unsuitable for the purpose in view.
The empirical investigation of samples known to represent unsuitable populations
can aid examination of the matter. The practical procedure which appears
to be most profitable is this: samples believed to be of the right kind are selected
and then certain tests, derived from experience of the statistical characteristics
customarily exhibited by such samples, are applied in order that abnormal series
may be revealed and rejected as unsuitable for further use.

The initial process of selecting samples must be considered rather more fully.
In practice it may be advisable to take into account information of several
kinds. The most important factor is usually geographical position, but the
separation of series representing different social classes may be desirable, and
archaeological, historical or linguistic evidence may suggest that the partitioning
should be carried out in a particular way. In dealing with skeletal
material the time factor may be an important one to consider; it may be well to
treat a pagan and a Christian Anglo-Saxon series separately, for example, and
not to assume that the two represent indistinguishable populations. The general
rule is that the total series for which data are available should be split up in a
natural way into as many subseries representing distinct populations of a
considerable size as possible, providing that each subseries selected is large
enough to give comparisons of value. The way in which this can best be done
depends on circumstances which are peculiar to each set of material. The
anthropologist is thus obliged to plan his survey of a particular region with
evidence of several different kinds in view. The majority of these do not relate to
physical characters, but he has a perfect right to consider, as relevant to his
biological enquiry, any evidence which gives some indication of the subgroups of
the total population which are such that intermarriage normally takes place, or
took place, within rather than between them.

The procedure described is necessarily rather vague and of an arbitrary
nature: any subdivision adopted initially is experimental. In practice choice is
largely controlled by the amount of information available. Suppose, for
example, that the total sample for which measurements are available represents
people coming from all parts of a particular country. If there are enough
individuals they may be divided into subsamples on a regional basis, or the
population of each region may be thought of as subdivided into county groups,
say, and each of these may be further subdivided into parochial communities,
which may be split up into small groups of closely interrelated people. A
hierarchy of groups within groups can thus be imagined, but in practice it will
not be possible to carry the process of subdivision beyond a certain stage, as
further dissection would lead to subsamples too small to yield comparisons of
value. There can be no assurance that the divisions based on geographical or
other considerations will be the most effective for the purpose in view, but some
such divisions have to be adopted in order to disclose the nature of the total
population.
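
The hierarchy of groups within groups, halted where further dissection would leave subsamples too small, can be sketched as a recursive partition. The rule used below (stop whenever any resulting subsample would fall under a fixed minimum) and the minimum itself are assumptions made for illustration, as are the record fields and the labels; the text holds that the smallest useful series can only be settled from experience.

```python
def subdivide(records, keys, min_size):
    """Split records by successive keys (region, county, ...), stopping at the
    level where a further split would leave any subsample below min_size.
    Returns a mapping from the path of labels to the records in that group."""
    if not keys:
        return {(): records}
    groups = {}
    for record in records:
        groups.setdefault(record[keys[0]], []).append(record)
    # Assumed stopping rule: refuse a split that yields too small a subsample.
    if len(groups) < 2 or any(len(g) < min_size for g in groups.values()):
        return {(): records}
    result = {}
    for label, members in groups.items():
        for path, leaf in subdivide(members, keys[1:], min_size).items():
            result[(label,) + path] = leaf
    return result


# Hypothetical individuals, each tagged with a region and a county.
records = (
    [{"region": "north", "county": "a"}] * 3
    + [{"region": "north", "county": "b"}] * 3
    + [{"region": "south", "county": "c"}] * 3
    + [{"region": "south", "county": "d"}] * 1
)
for path, members in subdivide(records, ["region", "county"], min_size=3).items():
    print(path, len(members))
```

In this illustration the northern region divides into its two counties, while the southern region is kept whole because one of its counties would supply a single individual.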

At the outset of his survey of new material relating to the population of a
particular region, the anthropologist thus requires a knowledge of the size of the
smallest series which can be used profitably for the purpose of the classification
in view. This minimum requirement cannot be determined from any a priori
considerations: the point has to be settled after an experience has been gained
from the comparison of series of different sizes. The conclusions reached in this
way are discussed below.

However ample his material is, the anthropologist engaged in investigating
remoter origins and relationships should choose his unit groups, whenever
possible, in such a way that they relate to populations of a considerable size.
Family and parochial groups should certainly be considered too small. The need
for restricting comparisons to larger groups than these if possible is evident on
account of both a priori and a posteriori considerations, but it is frequently
overlooked.* It should be anticipated that series representing small communities
are likely to be biased, not random, samples of regional populations.
The existence of local variants of a widespread population derived from a single
source has to be recognized, and hence special precautions have to be taken in
drawing deductions from samples representing local groups. In conformity with
these expectations, it is commonly found in treating a total group of the
national kind that samples representing small subgroups of it tend to differ
more than samples representing large subgroups. It is customary to find in such
a case that up to a certain point there is greater uniformity in the statistical
attributes of the samples, quite apart from fluctuations due to random
sampling, according as the groups represented by the samples become larger.
Greater diversity has to be expected when smaller subgroups are compared. If
the anthropologist concerned with the broader taxonomic problems of his
subject treats groups of the family or parochial kind, he will thus be in danger
of losing his way among the trees when he should be taking a bird’s-eye view of
the wood as a whole.

It is unlikely that an example of the statistical conception of a perfectly
homogeneous population is ever encountered in dealing with anthropological
material. The kind of continuum which has to be analysed by making comparisons
between its component parts is of a peculiar nature. It would be an advantage
if the subsections of it considered for the purpose of racial classification were
always populations of a considerable size, and for many sets of anthropometric
data this condition is satisfied. Not infrequently, however, and particularly in
the case of excavated skeletal material, the only samples available for a particular
regional group are of the parochial kind.

* The writer had overlooked it at one time, and he was accordingly admonished in a letter from
Karl Pearson dated 17 July 1924, in which the following passage occurs:

“May I give you an analogy? In a little Yorkshire village there are two manors, two small
squires, little better than yeomen, and they kept on intermarrying their family members. About
one-third of the churchyard contains, for at least two centuries, the graves of these folk. You could
get at least 30, and possibly 100, crania from that third of the churchyard which would differ very
sensibly from the skulls in the remainder of the yard. It would arise solely from the fact that we are
dealing with an inbred population, which possessed certain characters which raised it above, or at
least differentiated it from, the remaining population. I don't suppose there was any racial
difference between these little squires and their neighbours, they were all ultimately of Danish
descent. But simply one or two interbreeding families were buried in one earth. Now my analogy
has for its bearing the danger of picking out 20 crania and saying these differ from the rest.”

The Anglo-Saxon skeletons preserved in museums actually come from a
number of scattered cemeteries, and there is no long series from a single cemetery.
The fact that they were dispersed should be considered a real advantage when
the object in view is to determine the characters of Anglo-Saxons in general, or
of the larger subsections of this total population. If all the specimens had come
from a single cemetery it would have been difficult not to assume that the series,
particularly if it were a large one, could be accepted faute de mieux as
representing the total Anglo-Saxon population, and to draw conclusions from a
comparative study on this assumption. In fact there is good reason to suspect
that the physical characters of the people forming any small Anglo-Saxon
community differed appreciably in their averages from those of any large section
of the total population.

In practice it is seldom possible to exercise much choice in selecting samples.
In many instances the only material available representing a past population of
the national kind consists of one or more series of skeletons from one or two
cemeteries. The anthropologist cannot be expected to neglect all series which are
not ideally suitable, but he should remember that several he is obliged to use are
not ideal for the purpose in view. Allowance has to be made for known differences
between the samples used dependent on the diverse ways in which they were
selected.


3. THE STATISTICAL TREATMENT OF ANTHROPOLOGICAL SAMPLES

In treating his problems of classification the anthropologist starts by
selecting series of individuals representing different populations which appear to
be more or less suitable for the purpose. Data have been recorded for large
numbers of such samples relating to living peoples in different parts of the world
and to extinct peoples represented by series of skeletons. It is recognized that
these samples are taken from groups of different kinds, and that allowance will
have to be made for this fact in comparing them. Apart from this diversity
which is appreciated, there is no guarantee that all the samples are suitable for
the purpose of examining group relationships. Some may be entirely unsuitable
because they represent populations of exceptional kinds, and their peculiarity in
origin has to be detected by examination of the samples themselves. For
example, a certain number may be found to have peculiar characteristics which
suggest that they should not be used, while the majority conform to a particular
type. In practice it is necessary to appeal to experience derived from the
treatment of a large number of series in order to justify the use of certain tests
employed to discriminate between suitable and unsuitable material.

A question which might be asked is: What is the nature of anthropological
samples selected in the way described? What features are common to all of
them and in what ways do they differ? The statistician, who is accustomed to
thinking of problems in terms of populations and samples, points out at once
that, though this question may be of the right kind, yet it is not put in the right
way. He reminds the anthropologist that his ultimate concern is with populations,
and that the samples are really of interest only in so far as they supply
information regarding populations. A moment’s consideration shows that this
is, or should be, the view of the anthropologist. If he collects information
relating to 100 Greenland Eskimos, say, it is only in order to arrive at
generalizations regarding either all Greenland Eskimos, or else some section of them which
is still a population many times larger than the sample observed. Hence the
question should be: What generalizations regarding populations can be deduced
from samples selected in the way described? What features are common to all
the populations and in what ways do they differ?

The fact that his ultimate concern is with populations and not with samples
of them is not likely to be entirely overlooked by the anthropologist, but there is
a real danger that he may assume in particular instances that the characteristics
of samples are precisely the same as those of the populations they represent.
Statistical treatment has a great advantage in this connexion, since it keeps the
distinction referred to continually in view and provides a systematic and, as far
as circumstances permit, precise method of reaching the generalizations required.

In this paper the only kind of information regarding samples which will be
considered is that provided by measurements, whether of series of living people
or skeletons. The general problem can be viewed in the same way if
non-metrical characters are dealt with, but a different statistical treatment is
required for them. All the individuals referred to will be supposed adult, and for
such the measurements can be supposed, in general, to be unaffected by the age
of the individual. The data for males and females have to be considered
separately, but the generalizations reached are the same for the two sexes. Data
relating to numerous series are available for a particular set of characters
commonly recorded by anthropologists: nearly all of these measurements
concern either absolute size (chords and arcs) or shape (indices and angles). In
the case of the skull, for example (the data for it being more abundant than those
for all other bones of the skeleton put together), the measurements are designed
to give a description of the size and shape of the skeleton of the head considered
as a whole and of all its principal parts. The majority of the anthropometric data
for living people commonly recorded provide indirect measures of the size and
proportions of different parts of the skeleton.

With the object of obtaining information regarding the distribution of a
particular character in the population represented, the following features of the
distribution provided by a sample will be considered:

(i) its form,

(ii) its scatter, that is to say the variation exhibited by the readings for the
individuals composing the sample,

(iii) its central tendency, that is to say the value of the character (or …)
about which the readings can most conveniently be considered to be scattered.

These three features of the distributions are first considered in the case of
characters dealt with singly. It is also necessary to gain information regarding
the ways in which different characters are associated in individuals, and hence
it is necessary to examine:

(iv) the correlations of different pairs or groups of characters in individuals.

A sample of the kind so far considered is made up by a number of individuals
believed to belong to a population such that its members have been chiefly
intermarrying with one another for a number of generations. This is called an
intraracial sample, a rather unsatisfactory term to use at this stage since the
concept of race has yet to be defined. An examination of the forms of the
distributions and of their variabilities in the case of metrical characters
considered singly, and for a considerable number of intraracial samples, shows that
the only comparisons of them likely to give results of any anthropological interest
must be of a certain kind defined in later sections of this paper. Statistical
treatments of other kinds are seen to be unprofitable. The same evidence makes
it clear that the most appropriate measure of the central tendency of any one of
the distributions to use is the arithmetic mean, or average.

Attention is thus focused on mean measurements, and different ways of
treating them have to be considered. The conception is introduced of samples for
which the units are not actual individuals, as for intraracial samples, but
abstract beings such that each has metrical characters equal to the averages for
the particular group which it represents. This idea of l'homme moyen is not new
in anthropology: it was clearly enunciated more than 100 years ago by Quetelet
(1836). An assemblage made up by a number of hommes moyens representing
different populations is called an interracial sample, and in the case of a particular
character the mean measurements for a number of samples representing different
populations will provide an interracial distribution.

In order to gain a more complete knowledge of the statistical nature of
anthropological samples, required for the purpose of determining the ways in
which they can best be treated in practice, it is necessary to take cognizance of
other features of them in addition to those listed above. Interracial considerations
are involved in examining:

(v) the ways in which the averages for different characters distinguish
certain sets of populations,

(vi) the forms and variabilities of interracial distributions, and

(vii) the interracial correlations of characters.

In the course of the investigation of these topics it is possible to obtain
estimates of the minimum sizes of samples which are required, under different
conditions, in order to provide intergroup comparisons of value. One of the
advantages of the statistical approach to the problem of classification is that it
makes it possible to determine with some precision the least amount of evidence
which must be available in order to yield useful conclusions. The history of
physical anthropology bears clear witness to the fact that some control of this
kind is a vital need.

In the following section of this paper the forms that occur in practice of the
intraracial distributions of anthropometric characters are discussed. It is hoped
that this will be followed in later parts by discussions of the other topics listed
above (viz. other intraracial characteristics of distributions and characteristics
of interracial distributions) and, finally, by a discussion of methods of comparing
samples.

4. The forms of intraracial distributions

It can be stated categorically that the distributions of measurements for the
vast majority of samples that occur in anthropological practice tend to conform
closely to the normal curve. Quetelet's suggestion (1871) that this is so has been
confirmed by data relating to numerous series of living people and skeletons from
all parts of the world. In general, the samples from a particular population tend
to give a closer and closer approximation either to the normal or to a very similar
form of continuous curve according as their sizes are increased. Hence it is safe
to infer that the characteristic type is unimodal and symmetrical. This generalization
applies to all absolute measurements and, in spite of a theoretical qualification
mentioned below, also to all measurements of shape (indices and angles)
for which the matter has been adequately investigated.

There has been considerable confusion regarding interpretation of the forms
of frequency distributions in the discussions of this topic provided by some
anthropologists to whom statistical conceptions were unfamiliar. In the early
days of biometry the fallacy of drawing conclusions of certain kinds from
peculiarities of the curves provided by samples was repeatedly stressed, but this
warning is still occasionally ignored. The chief errors made with regard to the
matter have been in supposing (a) that a small sample (made up by fewer than
100 individuals, say) is capable of giving an adequate estimate of the form of
the distribution in the population sampled, and (b) that mere inspection of the
diagram for a sample is sufficient to reveal the information required. In this
connexion numerous examples may be found of the fallacy of supposing that the
features of a sample are precisely the same as those of the population from
which the sample was taken. The general appreciation by anthropologists of the
perfectly clear implications of a simple sampling experiment such as the one to
be described would still save much fruitless discussion.

The head lengths and breadths of 1314 soldiers who were natives of Lanarkshire
have been given by Tocher (1924). The order in which the first 700 men are
arranged in the table appears to have been entirely random as far as the
measurements are concerned.* The cephalic indices were calculated for this
sample and Fig. 1A shows the distributions obtained by taking successive groups
of 50 from the first (nos. 1-50) to the fourteenth (nos. 651-700). Each of
these subsamples provides an estimate of the distribution of the cephalic index
for Lanarkshire men in general, and any one of these subsamples might have
been the only one available relating to the total population. The distributions
in Fig. 1A clearly exhibit a great diversity of forms. They are all alike in showing
a majority of individuals with indices between 75 and 80, and in either covering
a continuous range or showing a few outlying values detached from the main
body. Otherwise, there is little agreement between them. Some show a more or
less gradual rise to a maximum frequency and then gradual fall (nos. [illegible]),
while others appear to have two (nos. XII and XIII) or more (nos. III and XI)
distinct peaks. The fact that inconsistencies are found in these respects makes it
obvious that the series are too small to provide any reliable information as to the
existence of such features in the distribution for the population they all
represent.
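The sampling experiment just described can be imitated with artificial data. The following is only a minimal sketch under stated assumptions: it does not use Tocher's measurements, and the population mean (78.0) and standard deviation (3.0) are hypothetical round figures chosen for illustration. It draws 700 values from a strictly normal population, cuts them into fourteen successive subsamples of 50, groups each in unit classes, and counts the apparent "peaks". The function names are introduced here for the illustration only.

```python
import random
from collections import Counter


def count_peaks(freqs):
    """Count local maxima in a list of class frequencies (a plateau counts once)."""
    padded = [0] + list(freqs) + [0]
    # Collapse runs of equal neighbouring values so that a flat top is one peak.
    runs = [padded[0]]
    for f in padded[1:]:
        if f != runs[-1]:
            runs.append(f)
    return sum(
        1
        for i in range(1, len(runs) - 1)
        if runs[i] > runs[i - 1] and runs[i] > runs[i + 1]
    )


def grouped_frequencies(values, width=1.0):
    """Frequencies in classes of the given width, lowest to highest occupied class."""
    classes = Counter(int(v // width) for v in values)
    lo, hi = min(classes), max(classes)
    return [classes.get(c, 0) for c in range(lo, hi + 1)]


def peaks_in_subsamples(mean=78.0, sd=3.0, n_total=700, size=50, seed=1):
    """Apparent number of peaks in successive subsamples of a normal population.

    mean and sd are hypothetical values, not estimates from any published series.
    """
    rng = random.Random(seed)
    data = [rng.gauss(mean, sd) for _ in range(n_total)]
    return [
        count_peaks(grouped_frequencies(data[i:i + size]))
        for i in range(0, n_total, size)
    ]


if __name__ == "__main__":
    # One count per subsample of 50; every subsample comes from a unimodal population.
    print(peaks_in_subsamples())
```

Any subsample showing more than one peak does so purely through random sampling, since the artificial population has a single mode by construction.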

Fig. 1B shows the same 700 measurements treated in successive subsamples
of 100, so that the first series (I') is the sum of the first two (I and II) in Fig. 1A,
and so on. The distributions for these larger series show more uniformity (the
maximum frequency being for the range 77-78 in four cases out of the seven, for
example) but they still differ very appreciably among themselves in some
respects, and one (VI') appears to be clearly of a bimodal form. It is evident
that the measurements of more than 100 individuals are required to give any
reliable indication of the details of the distribution of the cephalic index in the
parent population.

The top and the middle distributions in Fig. 1C relate to the samples made up by
the first 350 (I'') and the second 350 (II'') Lanarkshire men, respectively. Both
distributions appear to be of symmetrical form, but the first shows an outstanding
maximum frequency while the second is "flat-topped" (platykurtic).
The conclusion must be that it is unsafe to draw deductions regarding some
features of the distribution for the parent population by merely inspecting the
distributions of samples, even if they are as large as these two. The bottom
diagram in Fig. 1C relates to the total sample of 700 men considered. It appears
to be slightly asymmetrical, but comparison with the corresponding areas of the
normal curve fitted to it shows that the divergences of the [observed frequencies]
from their theoretical values on the hypothesis of normality are all [small]
relative to the [sampling errors] of these frequencies.

* After some number a little above 700 the order in which the individuals are arranged is clearly
not random. They were evidently taken in successive small groups such that each group had a
restricted range of head lengths, and consequently the distributions of the cephalic index for
samples of 50 (nos. 701-50, and so on) are very peculiar. Dr Tocher has kindly answered my
enquiries with regard to this matter, and he informs me that the departure from randomness was
not appreciated when the measurements were taken. It may have been due to the fact that at one
[words illegible] privates were measured in batches having hats of the same size, in the
belief that this would help the anthropologists! Departures from a random selection which may
affect the distributions of characters are easily overlooked.

This example is sufficient to demonstrate that it is entirely unsafe to attach
any significance to the peculiarities of the distributions for small samples
(made up by fewer than 200 individuals, say) and that mere inspection of the
forms of distributions for samples which would usually be considered adequate
for most statistical purposes is liable to mislead. The anthropologist who deduces
theories of racial mixture or relationship from the "peaks" of small distributions
is deceiving himself by building on a statistical foundation which is unsound
owing to the inadequacy of the evidence. He is trying to get more information
from his material than it can possibly provide with any assurance of correctness,
and forgetting that a sample only provides estimates of the features of the
population represented.

It is found in anthropological practice that distributions for small samples
frequently appear to be markedly skew, or to be bi- or multimodal, but that
those for large samples scarcely ever exhibit any of these peculiarities. The
writer is unable to give any examples of a distribution, for the kind of sample
now considered, which relates to more than [number illegible] individuals and which fails to
show a close approach to the form of a unimodal and symmetrical curve, with
the exception of two discussed in the appendix below. The occurrence of more
than one mode which appears to be definitely outstanding, or of any appreciable
degree of skewness, is apparently never exhibited by the distributions of
characters for large samples of the kind encountered in anthropological practice.
The conclusion must be that when such peculiarities are found for the distributions
of small samples they merely demonstrate that small samples are subject to
large "errors" of random sampling. The "peaks" can have no anthropological
or genetical significance. Their unstable nature can often be demonstrated by
merely grouping the individual measurements in a different way, as this may
change the appearance of the distribution to an appreciable extent.

If it can be demonstrated adequately that the distributions which occur in
practice almost invariably indicate that the distributions in the parent populations
must be closely similar in form to the normal curve, then this is obviously a conclusion
of great anthropological importance, since it implies that the kind of statistical
analysis likely to be at all profitable is severely restricted owing to the nature of
the material. Evidence of a more adequate statistical kind regarding the forms
of distributions will now be considered.

Values are given below of probabilities (χ², P test) obtained by comparing
various distributions with the theoretical frequencies obtained by fitting normal
curves to them. Those for the cephalic indices of series of Lanarkshire men for
samples of 100 or more (see Fig. 1, B and C) are in Table I. The lowest P found is
0.03, indicating that for samples of the size (100) actually drawn at random
from a normally distributed population, 1 in 33 would be expected to give a less
good correspondence with the theoretical distribution than that exhibited by the
sample in question. All the other P's are much higher, indicating that the
estimates provided by the samples bear a much closer resemblance to the normal
form in these cases.
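The calculation behind a P of this kind can be sketched as follows. This is an illustration only, not the procedure actually used for the tables: the original P's were read from printed tables, whereas the sketch evaluates the χ² tail probability from its closed form for whole-number degrees of freedom. A normal curve is fitted by the sample mean and standard deviation, the expected class frequencies are compared with the observed ones, and the degrees of freedom are taken as the number of groups less 3. The function names and the artificial data in the usage lines are assumptions introduced here.

```python
import math
import random
from statistics import NormalDist, fmean, pstdev


def chi2_sf(x, df):
    """P(chi-square with df degrees of freedom exceeds x), df a positive integer."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k < df/2} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= (x / 2) / k
            total += term
        return math.exp(-x / 2) * total
    # Odd df: erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2) * sum_k x^k / (1*3*...*(2k+1))
    term, total = 1.0, 0.0
    for k in range((df - 1) // 2):
        if k > 0:
            term *= x / (2 * k + 1)
        total += term
    return (math.erfc(math.sqrt(x / 2))
            + math.sqrt(2 * x / math.pi) * math.exp(-x / 2) * total)


def normal_fit_p(values, edges):
    """Chi-square, degrees of freedom and P for a normal curve fitted to `values`.

    `edges` are the sorted interior class boundaries; the two end classes are open.
    """
    n = len(values)
    fitted = NormalDist(fmean(values), pstdev(values))
    bounds = [-math.inf] + list(edges) + [math.inf]
    chi2 = 0.0
    for lo, hi in zip(bounds, bounds[1:]):
        observed = sum(1 for v in values if lo < v <= hi)
        upper = 1.0 if hi == math.inf else fitted.cdf(hi)
        lower = 0.0 if lo == -math.inf else fitted.cdf(lo)
        expected = n * (upper - lower)
        chi2 += (observed - expected) ** 2 / expected
    df = (len(bounds) - 1) - 3  # number of groups, less 3
    return chi2, df, chi2_sf(chi2, df)


if __name__ == "__main__":
    # Artificial data only: 700 values from a hypothetical normal population.
    rng = random.Random(0)
    sample = [rng.gauss(78.0, 3.0) for _ in range(700)]
    print(normal_fit_p(sample, [72 + i for i in range(13)]))
```

With 13 interior boundaries the sample falls into 14 groups, so the test has 11 degrees of freedom.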

TABLE I

Probabilities (χ², P test) that the distributions of the cephalic index for different
subsamples of 700 Lanarkshire men indicate that the character was normally
distributed in the parent population

Successive samples of 100 individuals*

        1st       2nd       3rd       4th      5th       6th      7th
P       0.[?]2    0.[??]    0.[?]0    0.88     0.7[?]    0.03     0.28

        First 350        Second 350       700
        individuals†     individuals†     individuals‡
P       0.77             0.[?]4           0.70

* Distribution in 9 groups. † Distribution in 1[?] groups. ‡ Distribution in 1[?] groups.

Table II gives the P's for distributions of the cephalic index in the case of
15 series of male skulls of the kinds which normally occur in anthropological
practice, each series being made up by more than 100 specimens. These were
selected merely because they happened to be the ones of the size required most
easily accessible to the writer. The first 11 are believed to be homogeneous in
the sense that any subseries of them likely to be chosen for anthropological
purposes would not show significant differences in any statistical constants. All
but one of the P's for these are high enough to provide excellent support for the
hypothesis that the cephalic index was normally distributed in the parent
population. The single exception is for an Egyptian series (11) which is known
from other evidence to have statistical features which are typical for "homogeneous"
samples, so the low P found for the cephalic index may well be
attributed to the vagaries of chance selection.

Series 12 and 13 are known to be heterogeneous to some extent, since the
means of the cephalic index and some other characters for their subseries show
differences which are significant though small. The P's for these two are both
high. Series 14 and 15 are known to represent populations which are unsuitable
for the purpose of investigating group relationships, since they are exceptionally
variable. This is clearly indicated by the standard deviations for the cephalic
index given in the right-hand column of Table II, and also by those for certain


TABLE II

Probabilities (χ², P test) that the distributions of cephalic indices provided by
various series of male skulls indicate that the character was normally distributed
in the parent populations

No. of   Series*                                 No. of   Mean      No. of   P†         Standard
series                                           skulls             groups              deviation

(a) Series believed to be homogeneous
1        Guanche                                 245      76.0      11       0.[?]      [?] ± 0.10
2        Eskimo (St Lawrence Is.)                156      77.1      10       0.[?]      2.[?] ± 0.14
3        Egyptian (26th-30th dyn.: Gizeh)        866      75.[?]    16       0.[?]      2.[?]4 ± 0.0[?]
4        New British                             114      72.2      11       0.7[?]     2.74 ± 0.1[?]
5        English (Camb. dissecting rooms)        11[?]    75.8      11       0.157      2.9[?] ± 0.1[?]
6        Eskimo (Greenland)                      160      71.3      13       0.[?]      3.0[?] ± 0.15
7        Czech                                   [?]      82.[?]    1[?]     0.470      3.17 ± 0.22
8        English (17th cent.: Whitechapel)       131      74.3      [?]      0.[?]65    3.28 ± 0.[?]
9        German (Reihengräber)                   220      73.8      13       0.878      3.35 ± 0.16
10       English (17th cent.: Farringdon St.)    135      75.4      11       0.137      3.48 ± 0.2[?]
11       Egyptian (18th-21st dyn.: Thebes)       187      75.1      13       0.283      3.55 ± 0.1[?]

(b) Series showing small but significant differences between component subseries
12       Egyptian (predynastic: Naqada)          166      72.7      13       0.66[?]    [?]
13       Swiss (Valais)                          453      83.9      19       0.423      4.01 ± 0.13

(c) Heterogeneous series
14       French (mediaeval and modern)           1000     79.6      22       0.000‡     4.32 ± 0.10
15       English (Bronze Age)                    1[?]1    78.8      12       0.860      5.[?] ± 0.31

(d) Artificial mixture of two series
2+12     Eskimo + Egyptian                       321      74.8      [?]      0.3[?]7    3.43 ± 0.14

* The series are the ones described in the following sources, which give the individual
measurements in the majority of cases:

(1) Guanche: Hooton (1925).
(2) Eskimo: Hrdlička ([year illegible]).
(3) Egyptian: Pearson & Davin (1924).
(4) New British: Müller ([year illegible]).
(5) English: Duckworth (1917).
(6) Eskimo: Fürst & Hansen (1915).
(7) Czech: Schiff (1912).
(8) English: Macdonell (1904).
(9) German (pooled): Morant (1929).
(10) English: Hooke (1926).
(11) Egyptian: Schmidt (1886).
(12) Egyptian: Fawcett (1902).
(13) Swiss: Pittard (1909-10).
(14) French: Topinard (1885, p. 38[?]).
(15) English (pooled): Morant (1926).

† The P's were found from Table XII in Tables for Statisticians and Biometricians, Part I, the n'
used being the number of groups, less 2 (i.e. degrees of freedom were taken as number of groups,
less 3).

‡ 0.00003.


other characters. One of these heterogeneous series (14) gives a P so low as
entirely to disallow for practical purposes the hypothesis that the index was
normally distributed in the parent population, but the P for the other (15) is
higher than several of those in section (a) of the table. The test thus fails entirely
to distinguish between all suitable samples, on the one hand, and all unsuitable
ones, on the other. Another example of its failure in this respect is recorded in
section (d) of the table, the data there relating to an artificial mixture of two
series having the very considerable difference of 4.4 units in their mean cephalic
indices. The P of 0.3 obtained gives no indication of the strange origin of this
sample.
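A short calculation suggests why a mixture of this kind can escape detection. The sketch below is a simplification under stated assumptions: it treats the mixture as made up of equal numbers from two normal populations having a common standard deviation, and the value 3.0 used for that standard deviation is a hypothetical round figure, not one taken from Table II. For such a mixture, with means separated by d and half-separation h = d/2, the variance is σ² + h² and the excess kurtosis is −2h⁴/(σ² + h²)².

```python
def equal_mixture_moments(delta, sigma):
    """Standard deviation and excess kurtosis of a 50:50 mixture of two normal
    populations with common standard deviation `sigma` and means `delta` apart."""
    half = delta / 2.0
    variance = sigma ** 2 + half ** 2          # within-group plus between-group part
    excess_kurtosis = -2.0 * half ** 4 / variance ** 2
    return variance ** 0.5, excess_kurtosis


if __name__ == "__main__":
    # delta = 4.4 index units as in the text; sigma = 3.0 is a hypothetical value.
    print(equal_mixture_moments(4.4, 3.0))
```

Under these assumptions the mixture remains symmetrical and is only mildly flat-topped (excess kurtosis about −0.24), a departure from the normal form which a sample of a few hundred skulls will frequently fail to reveal.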

It is easier to point to series which are clearly unsuitable for purposes of
classification than to any which would confidently be expected to be entirely
satisfactory. A distribution which appears as likely as any to reveal the
unsatisfactory nature of a series of the former kind has been provided by Herskovits
(1930, p. 161). It relates to the thickness of the lips of 969 male adult "Negroes"
in the United States. While a certain proportion of these men are believed to be
of pure African origin, the majority represent varying grades of miscegenation of
Negroes and members of other ethnic groups, with European ancestors
predominating but some admixture of American Indian blood. The character is
obviously one which makes a marked distinction between the types of the two
major groups of parental populations involved. The distribution for it gives a
P of 0.636 (13 groups). Examples of the same kind relating to samples which
are obviously unsuitable for purposes of group classification (such as mixed
"white" samples in the United States, or artificial mixtures of series which
differ markedly in some respects) might be multiplied indefinitely, and it is
quite usual in examining such material to find distributions which give high P's
when fitted with normal curves. Trevor (1938) has observed this situation for
all the characters he examined when dealing with data for living populations
in different parts of the world derived from the crossing of European and
non-European peoples.

It is actually found that the vast majority of the distributions of measurements
that occur in anthropological practice satisfy a test which shows the
hypothesis that the character was normally distributed in the parent population
to be not at all unacceptable. The high values of P usually found when comparisons
are made with normal curves fitted to the data do not demonstrate that the
characteristic type of distribution for all but exceptional, and presumably
unsuitable, populations is absolutely normal, but they are quite good enough
evidence to suggest that the population distributions must be closely similar to
unimodal and symmetrical forms.

This generalization applies almost equally well, as far as is known, to all
measurements, whether of shape or size, of series of living people or skeletons.
Furthermore, it apparently applies equally well to series from groups of any size
which are likely to be considered as populations by the anthropologist. Of the
series of skulls dealt with in section (a) of Table II, some came from single
cemeteries while others were made up by combining specimens from several
scattered cemeteries. Samples from small communities of the [former] kind
usually provide distributions which approximate to the form of the normal
curve neither more nor less closely than do samples of the same size selected at
random either from the populations of provinces or from those of countries. The
form of the distributions of metrical characters fails entirely to differentiate
populations of very different sizes.

The fact that a close approach to normality is almost invariably found
suggests that this condition can safely be accepted as one which must be
satisfied by the distributions provided by samples before they can be accepted
as suitable for group comparisons. It has been [shown], however, that the
distributions of samples which are known to be unsuitable for purposes of
classification may also satisfy the condition. Hence the test must be supposed a necessary
but by no means a crucial one. In the case of a particular sample the theoretical
test might be that all characters should show reasonably high probabilities that
the distributions in the population were normal. In practice, of course, the
question can never be examined for more than a small number of characters
(very seldom greater than [number illegible]) but it is important to appreciate that the test
should be applied, in theory at least, to all characters for which data are available.
The demonstration that a single distribution indicates a close approach to
normality is no evidence that any other distribution for the same sample will do
so.

In general, useful anthropological results of a statistical nature can only be
derived from the conjoint investigation of data for several characters. Different
characters are quite likely to suggest different conclusions, and the kind of
information required has to be obtained by taking the evidence of a sufficient
number into consideration. To take a hypothetical example, it may be assumed
that in the case of the crossing of two populations (A and B) the one derived from
them (C) will only be likely to show any departure from the normal form of
distribution, such as bimodality, in the case of those characters which show a
clear difference between the averages for A and B. But experience shows that
the two parental groups are likely to show clear differences in their averages for
only a small proportion of the characters compared, even if A and B belong to
quite distinct ethnic groups. Hence peculiarity in the distributions for the
hybrid population would be expected only in the case of a small proportion of
the characters examined, and it is necessary to examine as many as possible
in order to ensure that such a peculiarity has not been overlooked. Furthermore,
it should be appreciated that the chance of departures from normality being
found is very different for different characters. This point is discussed below.

A test which might be applied in practice could specify that the distributions
of all metrical characters recorded for a particular sample examined should give
P's, when compared with normal curves fitted to them, greater than some
arbitrarily chosen value (0.001, say). If one or more of the distributions were
found to give values less than the limit then the sample might be considered so
peculiar that it should not be used for purposes of classification. It is known,
however, that many unsuitable samples would pass such a test, and hence it is
of little practical value. Its value is lessened further by the fact that when a
clear departure from normality is found the peculiarity of the distribution can
nearly always be detected more easily on account of its abnormally large
variation: this is so for series 14 in Table II.

Much labour has been expended in making a detailed analysis of the distri- 
butions of characters provided by anthropological samples which are adequate 
in length for statistical purposes. Significant but slight degrees of skewness have 
sometimes been recorded. It has not been shown, however, that such examples 
provide any conclusions which are of the kind needed by anthropologists. Usually 
there is no guarantee that the samples were chosen absolutely at random, and
the slightly peculiar but erratic characteristics sometimes observed may well
be due to biased selection.

It has been shown that if the components of an index (A/B) are normally
distributed, then the distribution of the index conforms more closely to a
Pearson Type IV than to the normal curve. This topic has been discussed by
Merrill (1928) and Fieller (1932). A Type IV curve is unimodal and may be
almost symmetrical and very similar in form to the normal. In fitting normal
curves to a series of distributions it is generally not possible to observe any
tendency for indices to give lower values of P than absolute measurements.
Elderton & Woo (1932) made a detailed study of the forms of the distributions
for a single long series of Egyptian crania in the case of an unusual set of
measurements relating to cranial bones considered singly. They concluded:
"that the distributions of characters measured on the individual bones of the
skull are not of normal type, but rather that the skewness and kurtosis of such
distributions are peculiar to the individual measurement." This is a point of
considerable theoretical interest, though all the departures from normality
observed were in fact slight. The conclusion that the population distributions
almost invariably bear a close similarity to the form of the normal curve in the
case of all kinds of measurements appears to be sufficiently exact for almost all
practical purposes.
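The near-symmetry of an index formed from two normally distributed components can be examined by simulation. The sketch below rests on assumptions made only for illustration: the components are drawn independently (in real skulls length and breadth are correlated), and the means and standard deviations (breadth 150 and 5, length 190 and 6, in mm.) are hypothetical round figures. It forms the index 100 × breadth / length for a large artificial sample and computes the moment coefficient of skewness.

```python
import random
from statistics import fmean


def sample_skewness(values):
    """Moment coefficient of skewness, m3 / m2^(3/2), about the sample mean."""
    m = fmean(values)
    m2 = fmean([(v - m) ** 2 for v in values])
    m3 = fmean([(v - m) ** 3 for v in values])
    return m3 / m2 ** 1.5


def simulated_index(n=20000, seed=1):
    """Artificial indices 100 * B / L with independent normal B and L.

    The parameters (150, 5) and (190, 6) are hypothetical, not measured values.
    """
    rng = random.Random(seed)
    return [100.0 * rng.gauss(150.0, 5.0) / rng.gauss(190.0, 6.0) for _ in range(n)]


if __name__ == "__main__":
    index = simulated_index()
    print(fmean(index), sample_skewness(index))
```

With coefficients of variation of about 3 per cent in each component, as assumed here, the skewness of the simulated index is small, which agrees with the statement that departures of indices from the normal form are slight in practice.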

This is a result which anthropologists are very loath to accept in spite of the
clear evidence of a large amount of material which supports it. The belief that
markedly skew and bi- or multimodal distributions must be found was very
generally held thirty years ago, and it has not been abandoned yet. The fact that
expectations of this kind have been categorically denied to him is a matter of
prime importance to the anthropologist, as it suggests that certain methods of
analysis are quite impracticable on any valid statistical lines. He still lives in
hope of discovering a method of dissecting a sample in such a way as to disclose
the sources from which it was derived. Judging largely from the prevalence of
unimodal and symmetrical distributions which offer no scope for [illegible], but
also to some extent from other characteristics of samples to be considered, the
statistician says that it is idle to entertain such a hope in dealing with metrical
data. He suggests that statistical tests may profitably be used to distinguish
suitable from unsuitable samples, but that those selected for further use must
be kept intact and somehow be compared with one another as entities in order
that conclusions may be deduced regarding the relationships and descent of the
populations represented.

The difference between the two points of view is fundamental. Repetition of
the assertion that the vast majority of the samples with which they deal in
practice do not indicate either appreciable skewness or more than one mode in
the populations represented has had little effect in modifying the methods of
anthropologists in general, or even of those who commonly use statistical
methods. It is possible to go further, however, and to offer an explanation of
what appears at first to be a surprising uniformity in one characteristic of diverse
kinds of material.

It is an observed fact that the variabilities of all samples which can be
accepted for purposes of classification are very considerable and that they tend
to be similar in the case of all populations for which adequate data are available,
the earliest of these having existed in Egypt about [date illegible]. Absolute equality
in the populations is not indicated, but intraracial variabilities all lie within a
fairly restricted range in the case of any particular character. Evidence in
support of this statement will be provided in the following section of this paper.

In the case of a mixture of two populations, a distribution would only be
expected to be bimodal if the difference of the means for the two populations
happened to be appreciable compared with the two intra-group variabilities. An
estimate of the chance that the difference in question is likely to be appreciable
will be given by the ratio of interracial to intraracial variabilities, and this is
likely to be different for different characters. It has usually been assumed that
the differences between the types of populations defined by average measurements
are often large compared with the differences between individuals
belonging to the same population, but though this situation may be true for
skin colour it is not observed in the case of any metrical character for which
data are available. With the evidence of skin colour in view, anthropologists have
always been inclined to over-estimate the interracial variability and under-estimate
the intraracial variability of other characters. The evidence of measurements
corrects this tendency and shows that for metrical characters the situation
is markedly different from that which has often been postulated.

Examples relating to particular characters will now be considered. Fig. 2
shows the distributions of stature for three series of men. The first relates to a
tribe of Congo pygmies (Efé) recorded by Schebesta & Lebzelter (1933): it has a
mean of 1430 mm., which is very close to the smallest given for any people in the
world, and a standard deviation of 51.4. The third relates to the Dinka tribe of
Nilotic negroes, from unpublished data of A. MacTier Pirrie,* and its mean of
1804 mm. appears to be the largest on record, the standard deviation being 70.7.
The second distribution relates to men from a province of Japan recorded by
Matsumura (1925), and it has a mean of 1622 mm., which is almost midway
between those for the extreme series, and a standard deviation of 53.1.

There is seen to be an absolute separation between the ranges of the statures
for the pygmies and the Dinkas, the tallest member of the former group being
shorter than the shortest of the latter, though it is probable that the
distributions for larger samples would overlap to some extent. A mixed series
made up by taking 100 pygmies and 100 Dinkas would obviously be bimodal.

A distribution which would clearly be bimodal would also be provided if the
Japanese men and the pygmies, or the Japanese men and the Dinkas, were taken
together. If mixtures were made up by taking pairs of series at random from all
populations in the world, however, the chance of getting a particular pair with
means differing by as much as 182 mm. (= Dinka mean less Japanese mean) is quite
small.
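The statement that such mixtures would be bimodal can be checked numerically. The following is a minimal sketch only: it assumes that each series is exactly normal with the mean and standard deviation quoted above and that the two series are mixed in equal parts, and the function names are introduced here for illustration. It counts the local maxima of the mixed frequency curve on a fine grid of statures.

```python
import math


def normal_pdf(x, mean, sd):
    """Ordinate of a normal curve with the given mean and standard deviation."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def count_modes(mean_a, sd_a, mean_b, sd_b, step=1.0):
    """Count local maxima of an equal-parts mixture of two normal curves.

    Assumption: both components are exactly normal and equally weighted.
    The curve is evaluated every `step` mm. over the range covered by
    four standard deviations on either side of each mean.
    """
    lo = min(mean_a - 4 * sd_a, mean_b - 4 * sd_b)
    hi = max(mean_a + 4 * sd_a, mean_b + 4 * sd_b)
    n = int((hi - lo) / step) + 1
    xs = [lo + i * step for i in range(n)]
    ys = [0.5 * normal_pdf(x, mean_a, sd_a) + 0.5 * normal_pdf(x, mean_b, sd_b)
          for x in xs]
    # A grid point is a mode if it rises above its left neighbour and is
    # not exceeded by its right neighbour.
    return sum(1 for i in range(1, n - 1) if ys[i - 1] < ys[i] >= ys[i + 1])


# Means and standard deviations (mm.) as quoted in the text.
pygmies = (1430.0, 51.4)
japanese = (1622.0, 53.1)
dinkas = (1804.0, 70.7)

print("pygmies + Dinkas:", count_modes(*pygmies, *dinkas))
print("Japanese + Dinkas:", count_modes(*japanese, *dinkas))
# Two hypothetical series only 80 mm. apart, each with a standard deviation of 55 mm.
print("80 mm. apart:", count_modes(1500.0, 55.0, 1580.0, 55.0))
```

For two equally weighted normal components with a common standard deviation the mixture has two modes only when the means differ by more than twice that standard deviation, which is why the hypothetical pair 80 mm. apart gives a single mode.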

* I am indebted to Dr Otto Samson, who is working on the material, for the statures of the
Dinkas. The records were kindly lent by Prof. J. C. Brash.

Fig. 2. Intraracial distributions of statures representing an extremely short, an
intermediate and an extremely tall group, and an interracial distribution.
(Horizontal scale: stature in cm.)

This point is illustrated by the interracial distribution given at the bottom of
Fig. 2, which was compiled by taking all the means for series of 30 or more men
collected by Deniker (1926) and representing populations in all parts of the
world. The mean of the distribution is 1646 mm. and the standard deviation 58.8.
It can be shown that if pairs of values (i.e. means for series) are drawn at random
from it, then the probability that they will differ by less than 80 mm. is 0.58, by
less than 120 mm. 0.80, by less than 160 mm. 0.91, and by less than 200 mm.
0.96. It is also known from experience that a mixed distribution of statures made
up by combining two samples of equal sizes from populations in which the
character is normally distributed, and which differ in their means by 80 mm., is
quite likely to give a P on comparison with a normal curve fitted to it which
would be high enough to make the incorrect hypothesis thus tested acceptable if
the source of the material were unknown. With any appreciably wider separation
of the means for the two component series all P's found would probably be so
low as to indicate clear departure from normality. The position may be summed
up roughly in this way in the case of stature: if each of a number of samples of
unknown origin was actually a mixture of two samples of equal sizes representing
pairs of populations chosen at random from all in the world, then the form of the
distribution alone, in the case of about half of them, would not be sufficiently
peculiar to reveal clearly their distinction from samples representing single
populations, but in the case of the other half peculiarity in origin would be
indicated. In the same circumstances distributions which were definitely
bimodal would be expected to occur occasionally.
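A calculation of this kind may be sketched as follows. The sketch rests on an assumption not made in the text, namely that the interracial distribution of means is exactly normal with standard deviation 58.8 mm.; the difference of two means drawn independently then has standard deviation 58.8 × √2. Probabilities obtained in this way need not agree with those quoted above, which may have been derived from the observed distribution itself. The function name is introduced here for illustration.

```python
import math


def prob_means_differ_by_less_than(limit_mm, sd_of_means=58.8):
    """P(|m1 - m2| < limit_mm) for two means drawn independently.

    Assumption: the interracial distribution of means is exactly normal
    with standard deviation `sd_of_means`, so that the difference m1 - m2
    is normal with mean 0 and standard deviation sd_of_means * sqrt(2).
    """
    sd_of_difference = sd_of_means * math.sqrt(2.0)
    z = limit_mm / sd_of_difference
    # For a standard normal deviate Z, P(|Z| < z) = erf(z / sqrt(2)).
    return math.erf(z / math.sqrt(2.0))


for limit in (80, 120, 160, 200):
    print(limit, "mm.:", round(prob_means_differ_by_less_than(limit), 2))
```

Under the normal assumption the probabilities come out somewhat higher than the quoted figures, which suggests that the observed distribution of means is more widely spread in its tails than a normal curve with the same standard deviation.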

Actually, however, such forms are scarcely ever found in practice, and
hence it must be inferred that the condition postulated is of an artificial kind.
In fact the anthropologist is not at all likely to choose a sample so badly that it
represents what may be called a purely “mechanical” mixture of two distinct
populations which differ appreciably in their averages for one or more
characters. He is far more likely to come across a sample representing a population
derived from the partial or complete blood admixture of two groups which were
originally distinct. In such cases it is known that the mean for the mixed
population will lie between those for the parental groups. Trevor found
that the distributions for groups derived from the crossing of distinct ethnic
stocks usually give high P's when compared with normal curves fitted to them.
Samples occurring in anthropological practice may relate to diverse kinds of
conglomerations, as it were, of mechanical and blood intermixture of populations
which would best be considered singly in examining group relationships. In such
cases the forms of the distributions of metrical characters will usually fail
entirely to indicate the composite nature of the assemblage. A sample of the
“white” population of the United States consisting of recruits was found to give
a distribution of stature which could be adequately represented by a normal
curve (Hoffman, 1918).

Anthropologists find it hard to accept the conclusion that their frequency
distributions for heterogeneous material never give any clear indication of the
component parts of a particular sample. This situation is observed because the
interracial variation of metrical characters is of the same order as, or smaller
than, intraracial variation. The situation has been considered in the case of
stature, and for this character the interracial distribution gives a standard
deviation of the same order as those normally found for intraracial samples.
These questions have been investigated most fully for certain measurements of
the skull which are more reliable than the vast majority of the measurements
available for series of living people. Data similar to those illustrated in Fig. 2
for stature are given in Table III for three cranial characters. The material
collected relates to series of adult male skulls made up by 50 or more specimens and
representing populations, believed to be suitable for purposes of classification,
which have existed in various parts of the world from about 5000 B.C. to modern
times. In the case of each of the three characters considered the distribution for
the series with the lowest mean is given first, followed by the distribution of
racial means and then that for the series with the highest mean.


TABLE III

Distributions of three cranial measurements, each for an intraracial sample with
an extremely low mean (L) and one with an extremely high mean (H), and for
an interracial sample (I): male skulls


Horizontal circumference


Foraminal index



2-23 1 4'U I 4'11 


■11 15-1 


1()-1 1,t7 


7'Or. I'(i9 ,5-(i 


* Loyalty iHlaiider [iSaniHin it. Roux, lliUi '22). 
t Tc‘lm|i;hite (Riiiclier, lOlIt). 
i Baininf,' (New Britain) (Bauer, BUS). 


§ ReilicnKriiluT (Moriinl, 11)2H), 
i; TnnKaiiyilia (Rieit, lill.5), 

*' Veiu7,ui‘liiin (Miircimo, IHilO). 


The situation is seen to be much the same for the cephalic index as for
stature, the interracial variability being rather larger than that for nearly all
samples which can be accepted as suitable for purposes of classification. The
distributions for the series with extreme means cover contiguous ranges, and
there is no overlap. Of all the cranial characters for which adequate data are
available the cephalic index is quite outstanding in this respect. For most of
them the situation is very similar to that illustrated by the data in Table III for
the horizontal circumference of the skull. In this case the standard deviation
for the interracial distribution is appreciably less than any intraracial value found,
and the extreme distributions show considerable overlap. The limit in the other
direction is exhibited by the index expressing the breadth of the foramen
magnum as a percentage of its length. In this case interracial variation is much
smaller than intraracial variation, and the interracial distribution falls entirely
within the range for any intraracial sample. For most cranial characters the
averages for all populations in the world are quite likely to fall within the range
provided by any population chosen at random.

It is evident that the chances that a heterogeneous sample will exhibit a
distribution diverging appreciably from the normal form are very different for
different metrical characters. As far as is known, stature and the cephalic index
are the two most likely to show departures from the rule in the case of mixed
samples, since the ratio of inter- to intraracial variation is largest for them. For
this reason examples relating to the two characters have been given above.
Most cranial measurements are decidedly less likely to reveal heterogeneity by
the forms of distributions obtained for them, as the difference between the
means of any two samples inadvertently combined, compared with the two
intra-group variabilities, will be, on the average, appreciably less for them. For some
cranial characters interracial variation is so small compared with intraracial
variation that no heterogeneous samples will be at all likely to provide peculiar
distributions.

In view of the circumstances discussed above, it is not surprising that the
distributions which occur in anthropological practice scarcely ever show any
appreciable departure from a symmetrical and unimodal form. Detailed statistical
analysis of the slight peculiarities of the distributions sometimes observed
is never likely to be profitable in the case of small samples (of fewer
than 300 individuals, say), and in general it does not lead in the case of
adequately large samples to any conclusions which aid the classification of the
groups represented.

The normal curve may be accepted as the one which almost invariably gives
an adequate description of the distributions in the populations sampled, even
though some of these populations are unsuitable for the purpose in view. It is
usually found to effect this purpose remarkably well, in spite of the fact that this
theoretical curve can never be the correct one as it is of unlimited range. In this
application it is safest not to draw any inferences regarding the composition of
a population from the observed fact that its distributions are normal in form.
Very occasionally a clear departure from normality can be taken to indicate that
a peculiar and unsuitable population is represented.

For practical purposes the most important conclusion to be derived from the
forms of the distributions of anthropometric characters is that dissection of the
samples treated into subsamples which might be supposed to represent groups
of different origins is impracticable. Anthropologists in general are unwilling to
accept this conclusion. Other characteristics of the samples, such as their
variabilities and correlations, also have a bearing on the question and hence
discussion of it will be deferred for a later part of this paper.

The writer is indebted to Miss M. L. Tildesley, Prof. E. S. Pearson and Prof.
G. von Bonin for criticism which led to the improvement of this first part.


APPENDIX 

Certain distributions of cephalic indices provided by
Felix von Luschan

As far as the writer is aware, the only published distributions for any anthropometric
character which show a clear departure from normality, and which relate to unselected
samples made up by more than 100 individuals, are three provided by Felix von Luschan.
These are often referred to as evidence that such departures do occur in practice, though
usually without comment on the rareness of such an occurrence. All the distributions in
question are for the cephalic index, and they all relate to groups of men native to the
Near East.

The first is for 179 men who called themselves Greeks, measured in Lycia and neighbouring
localities in the south of Asia Minor, including a few islands. The figures for the cephalic
index were published (von Luschan, 1890). A frequency diagram in the same paper does
not agree with these figures, as it shows one index of OS while the highest in the table is 94.
This error was perpetuated when the distribution was reproduced in two later publications
(von Luschan, 1911, 1927). The frequencies are shown as histograms in the top diagram of
our Fig. 3A.

The second series is made up by 756 Turks from the south of Asia Minor and the north
of Syria. The figures for 187 of these men are available (1890) but not those for the
remainder. A frequency distribution “reduced to one-third” was given (1911, 1927). It is
not possible to read off the individual frequencies from this with absolute accuracy, but the
middle diagram of Fig. 3A reproduces them as closely as possible. With reference to this
distribution, von Luschan says (1911, p. 236) that it ranges from 69 to 90, but the highest
value given in his diagram is 92.

The third series is of 1222 Jews; “52% of these were Sephardim, whom I measured at
Smyrna, at Constantinople, at Makri, and in Rhodes; the rest were Ashkenazim measured
by myself ... at Vienna, Austria” (1911, p. 226). A frequency distribution “reduced to
one-fifth” is given and this is reproduced as accurately as possible in the bottom diagram of
Fig. 3A.

The first and third of these three distributions are clearly bimodal in form. That for the
760 Turks provides less clear evidence of the existence of more than one mode in the
population distribution, but it certainly suggests that this must have been clearly
asymmetrical. The three distributions provided by von Luschan relate principally to populations
of Asia Minor, and their peculiarities might be supposed due to the fact that this region is
inhabited by peoples who are racially heterogeneous to an unusual extent. In fact,
however, other series from it do not appear to be at all distinguished by any unusual forms of
their metrical characters. Measurements for three series representing the
populations discussed by von Luschan as closely as any available are provided by:

(i) Neophytos (1891): 142 Greeks in the town and neighbourhood of l%»i' ■*>(« in the
north-east of Asia Minor;

(ii) Hauschild & Wagenseil (1931): dfl.'i Turks from a number of localities in Anatolia;

(iii) Wagenseil (1112*')): 142 Jews measured in Constantinople; t<f U<~ ?*■ tn
that town, 13 came from various countries of Eastern Europe and the remainder were from
Asia Minor and Syria.



A. Three distributions provided by F. von Luschan.

Fig. 3. Distributions of the cephalic index for series of men measured in the Near East.


The frequency distributions of the cephalic index for these three series are given in
Fig. 3B, and they all differ markedly in form from the corresponding distributions provided by
von Luschan's data. That for the 142 Greeks is irregular, but little importance can be
attached to this fact as the sample is not a large one: it shows no indices below «», while
more than half of von Luschan's Greeks had indices below this value. The measurements for
Anatolian Turks and Constantinople Jews do not suggest that the population distributions
showed any clear departure from the normal form.

The corresponding pairs of series are also distinguished clearly by their variabilities. The
following standard deviations are found:

                         Greeks         Turks          Jews
Von Luschan's series     6.96 ± 0.37    6.20 ± 0.13    6.72 ± 0.14
Other series             3.21 ± 0.19    4.76 ± 0.16    2.90 ± 0.17
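The figures following the ± signs appear to be standard errors of the standard deviations. As a rough check, and on the assumption (not stated in the text) that they were obtained from the usual large-sample formula s/√(2n), they can be recomputed for the series whose sizes are given explicitly: von Luschan's 179 Greeks and 1222 Jews, and the comparative series of 142 Greeks. The function name is introduced here for illustration.

```python
import math


def standard_error_of_sd(sd, n):
    """Large-sample standard error of a standard deviation, s / sqrt(2n).

    Assumption: the character is approximately normally distributed and
    the sample is large enough for the usual approximation to hold.
    """
    return sd / math.sqrt(2.0 * n)


# (label, standard deviation of the cephalic index, number of individuals)
series = [
    ("von Luschan's Greeks", 6.96, 179),
    ("other Greeks", 3.21, 142),
    ("von Luschan's Jews", 6.72, 1222),
]

for label, sd, n in series:
    print(label, round(standard_error_of_sd(sd, n), 2))
```

The recomputed values agree with the tabled figures for these three series, which supports reading the ± figures as standard errors rather than probable errors.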


In comparison with other material, the abnormally high standard deviations of the
cephalic index for von Luschan's three series show at once that the samples are of a peculiar
kind unsuitable for racial comparisons. This material is altogether exceptional and it does
not affect the general conclusion that the distributions of metrical characters for samples
treated in anthropological practice almost invariably indicate that the population
distributions do not diverge appreciably from the normal form.

1. Introduction

The human skeletal remains which form the subject of this report were excavated
by the Wellcome (later the Wellcome-Marston) Archaeological Research
Expedition to the Near East from 1933 to 1936. They were collected by the late
Mr J. L. Starkey, who was then the Director of the Expedition, and it was
owing to his enthusiasm and untiring efforts in the field that so much material
is now available for study. His tragic death in 1938 deprived archaeology
and physical anthropology alike of one who was chiefly responsible for the
collection of some of the most valuable evidence extant relating to the early
history of the peoples of the eastern Mediterranean.

The human remains from Lachish were brought to England and the greater
part of the work on them was carried out at the Galton Laboratory, University
College, London, under the supervision of Dr G. M. Morant. I am indebted to
the Trustees of the late Sir Henry Wellcome for a grant which enabled me to
carry out this work over a period of rather more than three years, and also for
a subsidy which has made possible the complete presentation of the illustrative
and tabular matter collected for the report. I also owe a very great debt of
gratitude to Dr Morant for his untiring help and advice in all the stages of its
preparation. A complete report of the excavations is in course of preparation,
and I have to thank Mr Charles Inge, the present Director of the Expedition,
for most of the following particulars relating to the discovery of the bones.

The only published account of them is a note by Mr Starkey (1936), prefixed
to a description of three trepanned skulls by Dr Wilson Parry. He reports that
a roughly circular chamber (No. 107), which contained a deposit of human
remains much damaged by fire, was opened in 1934. At the same time an
adjoining and larger rectangular cavern (No. 120) was located. Starkey writes:

“The top layer consisted of many animal bones, mostly pig, and this refuse
should be ascribed to the latter half of the Judaean kingdom. ... The lower or
main deposit consisted of a mass of human bones, the remains of at least 1,500
bodies. As they were pitched in through the hole in the broken roof the skulls
rolled down from the apex of the pile to the sides of the chamber.”

Sherds of pottery intermixed with the bones can all be assigned to the
seventh and eighth centuries B.C.

“Some bones were partially calcined, suggesting that they were abstracted
from burnt buildings. ... Careful supervision of the clearance failed to establish
that any crania were in articulation with vertebrae, and the jaws were rarely
attached: in fact, no order was seen in the jumbled mass.”

It is suggested that the ossuary was probably connected with the salvage of
Lachish after its partial destruction by Sennacherib, King of Assyria, in 701 B.C.

“When floor level was reached it became clear that the tomb had been
previously used as a dwelling, a door had been cut at the N.E. [given erroneously
as N.W. in the published account] corner, connecting the smaller circular tomb
with it, which contained the 500 bodies discovered in 1934. From the style of
both chambers it is certain that they were originally excavated to contain early
fifteenth-century burials.”

Excavation of the area was extended and two other tombs were found, of
which no details have yet been published. Fig. 1 is a plan of the area in question,
showing all four tombs from which the human remains treated in this study were
obtained.

The archaeologists report that:

“All the tomb chambers were adjoining and interconnected, and the
deposits in Nos. 107, 108, and 120 were identical and as described by Mr Starkey.
Tomb No. 116 is in a slightly different category; it is an artificial cavern, with
a small entrance cut in the south wall of No. 108 near the bottom, containing
normal burials comparatively undisturbed. The period however is the same.”

It is said that the later discoveries do not provide any evidence which makes
closer dating possible. They include a Cypriote juglet, of a type common in
Palestine in the eighth century B.C., scarabs and scaraboids, bone pendants,
and blue glaze amulets. The last include representations of Egyptian deities
which were long popular in Palestine. There is nothing in the objects that one
would be surprised to find in normal burials at Lachish in the middle of the
Jewish period.

Fig. 1. Plan of the tombs at Lachish from which the human remains were recovered. The broken
lines indicate approximately the floor area of Tomb 120, and the outline of the benches (inner line)
and the maximum perimeter of Tomb 116. The approximate heights of the tombs were:
107, 2 m.; 108, 2.2 m.; 116, Ml m.; and 120, 3-ll m.

Further particulars regarding the state in which the human remains were
found are available. In all the tombs, except No. 116, there were distinct layers
of animal bones above the human deposits, and in the shallower part of Tomb 108
they came up to the roof level, the roof itself having been broken away. In
Tomb 107 the apex of the pile of human bones was about 1 m. above the floor
level, and the same height was about 1.3 m. in Tomb 120. There were fewer
human remains in Tomb 108: these were more scattered, and there were many
animal bones above them.

Tomb 116 is in a different category from the others, for it contained several
formal burials laid out on the benches, with the usual funerary equipment.
These burials were partially disturbed, and the skulls could not be distinguished
from others which may have rolled in at the time of the filling of Tomb 108.
The roof of Tomb 116 was intact, and most of the long bones collected come
from it as they were the best preserved.

The estimate that the remains of 1500 individuals were represented applies
to Tomb 120, and it is admittedly very rough. It was impossible to judge
whether the piles of bones included the complete skeletal remains of 1,500
people or not, but the impression was gained that the skeleton of the head was
far better represented than that of any other part of the body. A few mandibles
attached to crania were found, but there appeared to be no other cases of parts
of the skeleton existing in proper articulation. A suggested explanation of the
circumstances which led to the existence of the ossuary is given in the following
section of this paper, as evidence derived from the bones preserved has a bearing
on this topic.

Plates I and II reproduce photographs taken in the interior of Tomb 120
after the clearance down to floor level. The piles of skulls which had accumulated
round the walls are shown in situ. Plate II B is a close view of a small group of
skulls seen to the left of the beam in Plate I A and above the pick in Plate I B.

Other human remains from a number of tombs at Lachish, representing
fewer than 100 individuals in all, have been preserved, and the writer is
preparing a separate report on this material.


2. The general nature of the remains and remarks on the origin of the ossuaries

The human remains described in this report comprise all that were preserved
from the four tombs referred to in the foregoing section. They are made up
principally of crania, which appeared to be more abundant in the deposits than
any other parts of the skeleton, but there are also series of mandibles and long
bones of the limbs, and a few other bones. A selection was made of the better
preserved remains, and preference was given to crania. It should be realized
that the relative numbers of different parts in the collection preserved cannot be
supposed to be proportional to the frequencies with which they actually occurred
in the ossuaries.

D. L. Risdon

The specimens were dipped in paraffin wax in the field, and the cleaning of
them in the laboratory was a laborious task. While it was being carried out, four
crania distorted by earth pressure were found, as well as a certain number of
fragments of other parts of the skeleton which may have belonged to individuals
included in the series, or to individuals not otherwise represented. All this
material was discarded, and it is not counted in the totals. The numbers of
specimens, most of which are more or less incomplete, from each tomb are given
in the table below, no distinctions of sex or age being made:



Tomb          107    108    116    120    Totals

Crania         74            46    607
Mandibles      12             7
Femora          2            60     28
Tibiae          2      1     27     16      46
Fibulae         -      -      1      -       1
Humeri          2      -     23     23      48
Radii           1             6      8      14
Ulnae           3      3      3      6      16
Sacra           -      -      -      2       2
Clavicles       4      -      2      1       7


There are also a few vertebrae associated with skulls. 

The absence of other bones of the skeleton which would normally be well
preserved, such as those of the wrist and ankle, should be noted. In this
connexion Mr Inge remarks: “Less attention was paid to the skulls of children
owing to their supposed smaller anthropometrical value, and the same applies
to parts of the skeleton other than skulls and long bones.”

None of the bones are known to be associated, except that ten mandibles
and crania were preserved together, and are undoubtedly paired. It is probable
that a considerable proportion of the other mandibles do belong to the crania,
but it is not possible to associate them. Experience has shown that any attempt
to pair parallel series of these parts is unprofitable. It was also impossible to
associate the other bones with the crania or with one another.

After separating the immature specimens, the crania and mandibles were
sexed anatomically, giving the totals shown in the table below.

                           Tomb
                    107    108    116    120    Totals

Crania
  Adult ♂            33      3     19    303    360 (51.8%)
  Adult ♀            31      3     25    216    274 (39.4%)
  Immature           10      1      1     49     61 (8.8%)

Mandibles
  Adult ♂             6      0      3     26     34 (44.7%)
  Adult ♀             7      1      4     24     36 (47.4%)
  Immature            0      0      0      6      6 (7.9%)

Estimates of this kind cannot be exact in all cases, of course, but experience suggests that
80-90% are likely to be correct. The inclusion of a certain proportion of male-like,
but actually female, specimens in the male series, or vice versa, is not
likely to affect statistical constants appreciably. It has been shown by Martin
(1936) and Cleaver (1937) that isolated mandibles can be sexed anatomically
as accurately as crania. Particulars regarding the other bones of the skeleton are
given in § 10 below.
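The percentages attached to the totals in the table of sexed crania and mandibles can be recomputed from the totals themselves. The short sketch below does this; the dictionary labels and the function name are introduced here for illustration, and only the totals column is used, since some of the entries for individual tombs are poorly legible in the printed table.

```python
def percentages(counts):
    """Express each count as a percentage of the column total, to one decimal place."""
    total = sum(counts.values())
    return {label: round(100.0 * n / total, 1) for label, n in counts.items()}


# Totals for all tombs combined, as given in the last column of the table.
crania = {"adult male": 360, "adult female": 274, "immature": 61}
mandibles = {"adult male": 34, "adult female": 36, "immature": 6}

print("crania:", sum(crania.values()), percentages(crania))
print("mandibles:", sum(mandibles.values()), percentages(mandibles))
```

The crania total 695 and the mandibles 76, so that the proportions for the mandibles rest on a much shorter series than those for the crania.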

The proportions of the sexes and of immature individuals are seen to be
very much the same in all the tombs. The corresponding male and female
percentages in the table are not particularly close, but the series of mandibles
is obviously too small to give reliable proportions. It is safe to assume that
there are rather more males than females in the collection preserved. This does
not necessarily imply that the same was true for the total number of individuals
deposited in the tombs, as the male skulls, being stronger on the average, would
stand a slightly better chance of being preserved than female specimens. It is
probable, for a similar reason, that the proportion of immature individuals was
very considerably higher in the tombs. The proportions of men, women and
children there appear to have been very similar to those which would be expected
in a normal cemetery population.

Estimates of the ages at death of the adults are discussed in § 3 below. It is
shown there that the population was considerably younger than such as is
normally found in ancient or recent cemeteries. There are very few aged
specimens in the collection.

The condition of the bones throws little further light on the question of how
and why they were deposited in the tombs, but the following observations may
be noted in this connexion. There is only one skull (No. U)H, ♂, Tomb 120) with
wounds which are likely to have been the cause of death. This has pieces of
bone removed from both parietals (see Plate III B), and a long cut extending
across the left temporal line. The other lines on the left parietal and extending
across the sagittal suture, along which the bones are broken, may have been
made at any time after death. As far as can be told the damage shown in the
case of all the other specimens occurred in the tombs, and after the deaths of
the individuals. The greater part of it may be attributed to the crushing weight
of the piles of bones.

Among the totals for all tombs combined, there are twenty-nine
adult male, eight (2.9%) adult female, and two (3.3%) immature skulls with
wounds, mostly slight, which had healed completely during life. These
frequencies are not higher than those commonly found for excavated series of
crania.

In his account of the discovery of the remains, Starkey notes that those
in Tomb 107 were much damaged by fire, and that some bones in Tomb 120
were “... partially calcined, suggesting that they were abstracted from burnt
buildings”. It is probable that the great majority of the burnt specimens were
too damaged to be worth preserving. In the series preserved there are only
three skulls with burnt patches.* No. 486 (♀, Tomb 107) has an extensive area
on the occipital and right parietal bones affected (see Plate III A), No. 637
(♀, Tomb 120) shows a smaller area on the occipital bone, and No. 681
(immature, Tomb 120) shows a small burnt patch above the right orbit.

It may be noted, too, that there are three trepanned skulls (all males from 
Tomb 120), and a few artificially deformed skulls from the same tomb. There 
are two male skulls and one female skull which show definite signs of diseased 
conditions affecting the bone. Again, they all come from Tomb 120 but it must 
be remembered that the series from this tomb is far longer than that from any 
of the other tombs. 

The evidence relevant to the provenance of the human remains from the
four tombs at Lachish consists partly of archaeological observations, and partly
of deductions from the remains preserved. It must be remembered that the
series available were selected from a larger bulk of material. Taking the four
tombs together, the total remains in hand must represent rather more than 700
individuals. It was estimated that those of about 1500 were present in Tomb 120
when it was opened. As the dimensions of the piles of bones are known
approximately, it may be asked whether they could possibly have been made up by
bodies, or skeletons, which were complete when deposited. Taking Tomb 120
alone, the dimensions of the pile were roughly: height 1.3 × 7.5 × 8.5 m. (see
Fig. 1), giving a volume of about 83 cu. m. It is possible that at least 1500
complete skeletons, including some of children, were interred in a space of this
size. The tomb was nearly three times as high as the pile of bones, and hence it
is also possible that it could have contained a heap of 1500 complete bodies,
which might be supposed to have subsided on disintegration. The animal bones,
which formed a distinct layer over the human remains, may have been thrown
in after such a subsidence. So far there is nothing to preclude the possibility
of the interment having been either of bodies, or of complete skeletons, such as
might have occurred in clearing a neighbouring cemetery.
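The arithmetic behind the estimate of volume can be set out briefly. The sketch below takes the dimensions of the pile in Tomb 120 as 1.3 × 7.5 × 8.5 m., the reading which reproduces the stated volume of about 83 cu. m., and treats the pile as a rectangular block, which is a simplification introduced here. It also gives the space available per individual if 1500 skeletons were present; the variable names are illustrative only.

```python
# Approximate dimensions of the pile of bones in Tomb 120, in metres.
# Assumption: the pile is treated as a rectangular block of these dimensions.
height_m = 1.3
length_m = 7.5
breadth_m = 8.5

# Estimated number of individuals represented in Tomb 120.
individuals = 1500

volume_cu_m = height_m * length_m * breadth_m
space_per_individual_cu_m = volume_cu_m / individuals

print("volume of pile (cu. m.):", round(volume_cu_m, 1))
print("space per individual (litres):", round(space_per_individual_cu_m * 1000.0, 1))
```

On these figures each individual would be allowed rather more than 50 litres of the pile, which bears on the question whether complete skeletons could have been accommodated.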

The hypothesis that the remains were obtained from a cemetery (so that
Tomb 120, and possibly Nos. 107 and 108 as well, were really clearance pits) seems
to be unacceptable for several reasons. In the first place, it is almost invariably
found that among bones cleared from cemeteries, crania and femora are far
better represented than any other parts of the skeleton, and in such a case a
pile as large as that in Tomb 120, say, would have been likely to contain far
more than 1500 skulls. After clearance, too, it would have been unlikely that
any skulls and mandibles would have been associated, though a few pairs were
found. Another argument against cemetery clearance is that few skeletons of
children are likely to have been thrown in the tombs, on the hypothesis
considered, and practically all these would probably have been too crushed in the
pile to be worth preserving. Of the total number of skulls in the series, 8.8%
are immature, including a fair proportion of children under 10 years of age, and
it is unlikely that this frequency would have been so high after two stringent
selections of the material. The scarcity of aged specimens also tells against the
hypothesis of clearance from a cemetery.

* It was found that the paraffin wax in which the bones had been dipped could be removed
most effectively with the aid of a flame, and although great care was exercised, a few specimens
now show signs of slight burning, chiefly at the base of the skull, acquired in the laboratory.

In view of the fact that the skeletons of children are far more easily damaged
than those of adults, and that bones of women are more easily crushed than those
of men, it is not at all unlikely that the proportions of immature to mature, and
of female to male, skeletons in the tombs, were very similar to those of the living
population of Lachish at a particular time. The rarity of aged specimens must
be remembered. These facts seem to tell against the theory that the individuals
in the tombs were massacred, but a stronger argument against it is the fact that
only one skull showing injury which might have been the cause of death is
recorded.

The most plausible explanation, which appears to be in accordance with all
the evidence, is the following. It may be supposed that some catastrophe, such
as pestilence or earthquake, overtook the population of Lachish about the year
700 B.C., and that a large proportion of the inhabitants were victims. Ordinary
burial at such a crisis would have been impossible, and in clearing the town,
some time after it, the underground chambers in question would have been
convenient depositories into which bodies could have been thrown. The fact
that some were burnt is not surprising, as the catastrophe may have been
accompanied, or followed, by a fire in part of the town. The fact that the age
distribution is appreciably younger than would be anticipated for any cemetery
population of the period is accounted for on the hypothesis considered.

3. Remarks on the condition and anomalies (other than dental) of the Lachish crania

Remarks on the Lachish skulls are given in the appended tables of individual
measurements. Owing to the incomplete nature of many of the specimens, the
totals observable in the case of particular anomalies are generally considerably
less than the totals for the whole collection, and hence they have to be given
separately in considering the frequencies of different conditions. The total
numbers of the skulls from different tombs divided into sexes for the adults, and
the numbers of immature specimens, are given on p. 103 above. In general, the
frequencies are considered separately in the case of the long series from Tomb
120, and of the series made up by combining the skulls from all the other tombs.
Subdivision of the latter is inadvisable, as the numbers from all the tombs other
than 120 are small.

Sutures. Remarks on the coronal, sagittal, and lambdoid sutures are
given in the tables of individual measurements. If no mention of one of these
is made there, this indicates that it shows no sign of closing. Every specimen is
complete enough to observe the three principal sutures. The approximate
estimate of the age constitution of the series, which can be obtained from these
observations, will be considered first. Frequencies are given in Table I. If
a suture showed any sign of the beginning of closure, the skull was counted in
the second category (sutures beginning to close or partly closed). A suture
was counted as closed if synostosis was apparent for at least the greater part of
its length.
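
The counting rule just stated can be written out as a small function. This is a minimal sketch, not the procedure actually used in the laboratory: the three state labels and the function name are hypothetical, and each suture is assumed to have already been scored as open, closing, or closed (closed meaning synostosis over at least the greater part of its length).

```python
# Sketch of the three-way classification used for Table I.
# The labels below are illustrative names, not terms from the paper.
OPEN, CLOSING, CLOSED = "open", "closing", "closed"


def age_category(coronal, sagittal, lambdoid):
    """Assign a skull to one of the three categories of Table I."""
    states = (coronal, sagittal, lambdoid)
    if all(s == OPEN for s in states):
        return "all sutures open"
    if all(s == CLOSED for s in states):
        return "all sutures closed"
    # Any sign of closure in any suture, short of all three being
    # closed, places the skull in the intermediate category.
    return "sutures beginning to close or partly closed"


# Hypothetical records, for illustration only.
print(age_category(OPEN, OPEN, OPEN))
print(age_category(OPEN, CLOSING, OPEN))
print(age_category(CLOSED, CLOSED, OPEN))
print(age_category(CLOSED, CLOSED, CLOSED))
```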


TABLE I

The age constitution of the adult Lachish and other series estimated from the state
of closure of the principal calvarial sutures (coronal, sagittal, and lambdoid)*

                                               Sutures beginning
                              All sutures      to close or          All sutures
Series                        open             partly closed        closed          Totals

♂
Lachish, Tomb 120             123 (43.3%)      153 (53.9%)            8 (2.8%)       284
Other tombs†                   26 (45.6%)       28 (49.1%)            3 (5.2%)        57
Total‡                        149 (43.7%)      181 (53.1%)           11 (3.2%)       341
Kerma, 12th and 13th dyn.      16 (11.3%)       83 (58.9%)           42 (29.8%)      141
Gizeh, 26th-30th dyn.          74 (37.0%)      103 (51.5%)           23 (11.5%)      200
Farringdon St, Londoners       19 (12.2%)      100 (64.1%)           37 (23.7%)      156
Whitechapel, Londoners         21 (15.4%)       85 (62.5%)           30 (22.1%)      136
Spitalfields, Londoners       103 (19.3%)      301 (56.3%)          131 (24.5%)      535
Hythe                          23 (20.9%)       62 (56.4%)           25 (22.7%)      110

♀
Lachish, Tomb 120             154 (74.0%)       54 (26.0%)            0              208
Other tombs†                   45 (76.3%)       14 (23.7%)            0               59
Total‡                        199 (74.5%)       68 (25.5%)            0              267
Kerma, 12th and 13th dyn.      40 (35.1%)       51 (44.7%)           23 (20.2%)      114
Farringdon St, Londoners       94 (46.3%)       80 (39.4%)           29 (14.3%)      203
Whitechapel, Londoners         66 (46.8%)       49 (34.8%)           26 (18.4%)      141
Spitalfields, Londoners       121 (49.4%)       94 (38.4%)           30 (12.2%)      245
Hythe                          51 (59.3%)       26 (30.2%)            9 (10.5%)       86

* The data in this table for all the series other than the Lachish are taken from Stoessiger &
Morant (1932, p. 170) and Collett (1933, p. 259).
† Comprising Tombs 107, 108, 116.
‡ The posthumously and artificially deformed Lachish skulls, and those not included in the
pooled series on account of the fact that they show premature closing of the sagittal suture, as
well as No. 380 (premature closing of the coronal suture) and No. 382 (of unusual form), are not
included in the totals given in this table.


It should be noted that all the data relating to sutures refer to their
ectocranial aspects. The endocranial surface of the brain-box could only be observed
in a few specimens, as cleaning was impossible, and hence no remarks on the
inside of the skull were recorded.

The division of the adult skulls into the three categories shown in Table I
was made with the primary purpose of obtaining a rough estimate of the
distribution of age at death for the individuals whose skulls are preserved. It is
known that there is considerable variation in the ages at which the different
sutures close; a skull with the sutures partly closed, for example, may well
have belonged to an older individual than one with all sutures open. The actual
age distributions for the separate groups, if they could be known, would doubtless
show very considerable overlap. In spite of this it is safe to assume that the
percentages may fairly be compared with those given for another series observed
in precisely the same way, in order to determine differences in the age
constitutions of the samples. The data for the other series given in the table were
actually collected by Dr Morant, who confirmed the fact that my records had
been obtained in a comparable way.
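
As a minimal illustration of how the percentages being compared are obtained, the sketch below recomputes them from the category counts for two of the male series. The counts are those read from Table I (Lachish, all tombs, and Kerma); the function name is an assumption of this example.

```python
def percentages(counts):
    """Percentage of the series total falling in each category."""
    total = sum(counts)
    return [round(100 * c / total, 1) for c in counts]


# Counts in the order: all sutures open, beginning to close or
# partly closed, all sutures closed (male adults, from Table I).
lachish_male = (149, 181, 11)
kerma_male = (16, 83, 42)

for name, counts in (("Lachish", lachish_male), ("Kerma", kerma_male)):
    print(name, sum(counts), percentages(counts))
```

The comparison is only as good as the assumption stated in the text, namely that both series were scored in the same way.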

There are very clear differences between the percentages for the Lachish
skulls and those for the other series, while the corresponding values for Tomb 120
and the combined series from the other tombs are remarkably close. The usual
sexual difference, due to the fact that the sutures close at a later age for females
than for males, is found. It is evident that the average age at death must have
been considerably less for the Lachish adults than for any other of the groups of
people. All the other series, except the Whitechapel (obtained from a burial pit,
probably used in time of plague), are believed to represent ordinary cemetery
populations.


TABLE II

The frequencies of different orders of closing of the principal calvarial
sutures for the Lachish skulls (all tombs)*

Sagittal closing first, coronal and lambdoid open or closing together
Sagittal closing first, coronal second and lambdoid last
Sagittal closing first, lambdoid second and coronal last
Coronal closing first, sagittal and lambdoid open or closing together
Coronal closing first, sagittal second and lambdoid last
Lambdoid closing first, sagittal and coronal open or closing together
Lambdoid closing first, sagittal second and coronal last
Sagittal and coronal closing together before lambdoid
Sagittal and lambdoid closing together before coronal
All three sutures closing together

Totals: ♂ 203, ♀ 74

[Frequencies for the individual rows are illegible; only the totals are recoverable.]

* Excluding the posthumously and artificially deformed, but including the eleven male and
six female skulls showing premature closing of the sagittal suture, and No. 380, showing premature
closing of the coronal suture.
† Including eight showing premature closing of the sagittal suture.
‡ Including five showing premature closing of the sagittal suture.
§ Including one showing premature closing of the sagittal suture.
‖ Including three showing premature closing of the sagittal suture.
¶ Including No. 380.


Statistics relating to the orders in which the three principal calvarial sutures 
were closing are given in Table II. It was possible to observe these orders in 
the case of all the adult skulls for which closure had commenced, except one 
aged male (No. 357) in which all three sutures are completely obliterated. The 
special series of adult specimens presumed to have been affected by premature 
closing of the sagittal suture, and the one (No. 380) distorted owing to premature 
closing of the coronal suture, are included in Table II. They were omitted
from Table I, as their anomalous conditions might give fallacious estimates of 
age at death. 

The figures show that for the majority of skulls the sagittal suture began to
close before the other two, and that the coronal normally closed before the
lambdoid. The order sagittal-coronal-lambdoid is also found with the greatest
frequency for the Kerma Egyptian series (Collett, 1933). The frequencies for
the two series may be summarized in another way, as follows, only skulls for
which the first suture to close can be observed being included:






                                               ♂                             ♀
                                      Lachish        Kerma          Lachish        Kerma

Sagittal closing before other two     120 (81.1%)    75 (79.8%)     28 (47.5%)     34 (68.0%)
Coronal closing before other two       24 (16.2%)    18 (19.1%)     30 (50.8%)     14 (28.0%)
Lambdoid closing before other two       4 (2.7%)      1 (1.1%)       1 (1.7%)       2 (4.0%)

Totals                                148            94             59             50


The differences between the corresponding percentages for the male series are
quite insignificant, but differences which must be considered significant are found
for the first two pairs of percentages in the case of the females. It should be
pointed out that the question which suture shows a more advanced stage of
closure than the other two may depend on the age of the individual, and if the
rates at which the different sutures close are markedly different, comparisons
between the frequencies considered for different series may be misleading.
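
The text does not say which test of significance was applied. As a hedged modern illustration, the sketch below uses a pooled two-proportion z statistic (a substitute technique, not necessarily the author's) on the counts of skulls with the sagittal suture closing first: 120 of 148 Lachish males against 75 of 94 Kerma males, and 28 of 59 Lachish females against 34 of 50 Kerma females.

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-sample z statistic for a difference of proportions."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


# Sagittal suture closing before the other two: Lachish v. Kerma.
z_male = two_proportion_z(120, 148, 75, 94)
z_female = two_proportion_z(28, 59, 34, 50)

print(f"males:   z = {z_male:+.2f}")
print(f"females: z = {z_female:+.2f}")
```

On these counts the male statistic is well under 1 in absolute value and the female one exceeds 2, which agrees with the verdict in the text; the caution about age-dependent rates of closure applies to this calculation as much as to the original comparison.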

For the English series, examined in the same way, the sagittal suture was
found to be the first to close, almost invariably followed by the coronal and
lambdoid, which appeared to close together. For a Negro series the coronal
was found to show a slight tendency to close before the sagittal. There are
certainly racial differences with regard to the order in which the calvarial sutures
close, and, as far as can be seen, the Lachish series agrees with Ancient Egyptian
in this respect.

It can be seen from Table II that there are only four male and one female
skull showing the lambdoid suture closing before either of the others. An
examination of these specimens shows that their sutures were definitely closing
in an anomalous way in the majority of cases. Of the four males, Nos. 184, 265
and 330 show the occipito-mastoid suture obliterated on both sides; No. 184
also shows the parieto-mastoid suture obliterated on the left. No. 265 shows the
temporal squama completely fused to the parietal on the right, and partly fused
on the left; and No. 330 shows both parieto-mastoid sutures obliterated and the
temporal squamae largely fused to the parietals. The other male skull with the
lambdoid suture closing first (No. 268) has all the sutures between the temporal
bones on the one hand, and the occipital and parietal, on the other, open. The
female specimen (No. 619) has the occipito-mastoid sutures obliterated, and
the posterior parts of the temporal squamae fused to the parietals.

There are five other male skulls, and two other female, showing obliteration
of the parieto-mastoid suture, and/or partial or complete fusion of the temporal
squamae to the parietal bones. In these cases the principal calvarial sutures
are open or closing in a normal order, as far as can be seen, and one (No. 357)
has them obliterated. The specimens in question are:

No. 90, ♂. Occipito-mastoid suture L obliterated, and posterior half of
temporal squama L fused to parietal.

No. 171, ♂. Occipito-mastoid suture L obliterated, and posterior half of
temporal squama L fused to parietal.

No. 230, ♂. Temporal squama R largely fused to parietal.

No. 262, ♂. Occipito-mastoid suture L and parieto-mastoid suture L
obliterated.

No. 357, ♂. Temporals completely fused to occipital and parietal bones.

No. 383, ♀. Occipito-mastoid sutures, R and L, and parieto-mastoid suture R
obliterated.

No. 478, ♀. Occipito-mastoid sutures closed, R and L, temporal squamae fused
to parietals, R and L.

There is one juvenile specimen (No. 680) showing the right parieto-mastoid 
suture obliterated, but all others open. 

In all there are six male skulls showing the temporal squamae completely,
or partially, fused to the parietal bones, three on both sides, two on the left
only, and one on the right only; and there are two female skulls showing the
same condition on both sides. A count was made of the number of cases showing
the occipito-mastoid suture completely obliterated on one or both sides; the
totals are:


        Right         Left          Right         Total no. of adult
        and left      only          only          skulls examined

♂       [illegible]   [illegible]   1             [illegible]
♀       [illegible]   0             1             272


When the condition is unilateral, it appears to be shown more frequently on
the left than on the right side, but the numbers are too small to warrant a
generalization.

Premature closing of the sagittal suture was noted in the case of eleven male
and six female adult skulls, all coming from Tomb 120. These were not included
in the series used for comparative purposes, but individual measurements for
them are given in a separate section of the appended tables. In the case of six
male, and four female specimens the sagittal suture is completely obliterated,
and the coronal and lambdoid completely open. In the case of the remaining
five male, and two female specimens the sagittal is completely obliterated and
one or both of the other two sutures are closing or closed. Few of these skulls
show any clear sign of distortion owing to the abnormal closure of the sagittal
suture, the only exceptions being No. 364 (male) and Nos. 567 and 672 (female),
these having the lowest cephalic indices. The three can safely be called
scaphocephalic, but the forms of the others are not exceptional, and hence it may be
presumed that their sagittal sutures did not become synostosed until growth
of the brain-box was nearly completed. The cephalic indices of all the specimens
showing premature obliteration of the sagittal suture compared with those of
the normal series (Tombs 107, 108, 116 and 120 combined) are as under:





                                60.05-   65.05-   70.05-   75.05-   80.05-   85.05-
                                65.05    70.05    75.05    80.05    85.05    90.05    Totals

♂  Normal series                  1       26      160      112       11        0       310
   Premature closing of
   sagittal suture                1*       1        7        2        0        0        11

♀  Normal series                  0       11      112      117       16        1       257
   Premature closing of
   sagittal suture                1*       1*       3        1        0        0         6

* Accepted as scaphocephalic.
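
The class limits ending in .05 suggest that the indices were recorded to one decimal place, so that no value can fall on a boundary. The sketch below shows how an index could be computed and assigned to one of these five-unit classes. It is an illustration only: the measurements are hypothetical, the function names are assumptions, and the cephalic index is taken in its usual sense of 100 times maximum breadth over maximum length.

```python
import math

CLASS_WIDTH = 5.0
FIRST_LOWER_LIMIT = 60.05  # lower limit of the first class in the table


def cephalic_index(max_breadth_mm, max_length_mm):
    """Cephalic index: 100 x maximum breadth / maximum length."""
    return 100.0 * max_breadth_mm / max_length_mm


def index_class(ci):
    """Return the five-unit class (as 'lower-upper') containing ci."""
    k = math.floor((ci - FIRST_LOWER_LIMIT) / CLASS_WIDTH)
    lower = FIRST_LOWER_LIMIT + k * CLASS_WIDTH
    return f"{lower:.2f}-{lower + CLASS_WIDTH:.2f}"


# Hypothetical skull: breadth 140 mm, length 190 mm.
ci = round(cephalic_index(140, 190), 1)
print(ci, index_class(ci))
```

With limits of this kind an index of 65.0 falls in the first class and 65.1 in the second, which is the purpose of placing the limits between recordable values.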

There are no brachycephalic skulls among those in the special series, but the
indices for the majority of them are clearly not abnormal. The fact that all the
adult specimens showing premature closing of the sagittal suture came from
the same tomb (No. 120) is suggestive. It may well be that the condition is of
genetical origin, and that the affected individuals were related. This explanation
would account for the high proportion of specimens exhibiting the anomaly.

One other female skull (No. 577) has the posterior half of the sagittal suture
completely obliterated, but the anterior half, and the coronal and lambdoid
sutures, completely open. The specimen also shows a distinct post-coronal
depression, though this is not apparent in the median sagittal plane.

There is only one other skull (not included in the normal series) which
apparently shows deformation resulting from premature closing of a suture.
This is No. 380 (Plate VII f); its coronal suture is almost obliterated, while the
sagittal and lambdoid are completely open. This is probably a case of
oxycephaly, and, though the deformation is not unlike that which was clearly
produced artificially in the case of certain specimens, yet all those showing
artificial deformation most clearly have the coronal suture completely open.
No. 380 has a height-length (100 H'/L) index of 83.3, whereas the highest for
a normal adult male skull of the series is 80.8.

No. 59 (male) shows the superior half of the coronal suture on the right side
obliterated, but other parts of this suture open, while the sagittal is beginning
to close and the lambdoid is open. It does not appear to be deformed, the
height-length index of 75.1 being rather high but by no means extreme. No. 584
(female) shows the right side of the coronal suture closed, while the left side
and the sagittal and lambdoid sutures are open.

The complete metopic suture was found for twenty-six of the 341 male
adult skulls included in Table I, and slight traces of it were observed in a few
other cases; it was also found for twenty-two of the corresponding total of 267
female specimens. There are nineteen of the male, and twenty-one of the female,
affected individuals showing the metopic and sagittal sutures either both open,
or closing together. In the remaining female, and six of the remaining male
specimens, either one or the other suture is closing first, but the difference
between the state of closure of the two is only slight. The greatest difference in
this respect is shown by the last male specimen, which has the sagittal suture
closed and the metopic open. The conclusion derived from earlier material that
the metopic suture normally closes about the same time as the sagittal, if it
persists to an adult stage, is thus confirmed.

The male percentage for the condition is 7.6, and the female 8.2. These
values are higher than those for the Kerma Egyptian (4.5 male, and 6.2 female),
but rather lower than the percentages usually found for European series. Of the
sixty-one immature skulls from Lachish, there are eleven (18.0%) showing the
complete metopic suture, an appreciably higher proportion. Of the eleven male
adult skulls not included in the normal series on account of premature closing
of the sagittal suture, one is metopic, and of the six female in the same group
two show the condition. Of the seven male skulls which show patent artificial
deformation, or a suspicion of this condition, two are metopic, and of the two
female, one is metopic. The two specimens which are most clearly deformed
(Nos. 381 male, and 673 female) are both metopic, which is curious.
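
The metopic frequencies quoted above can be cross-checked against the counts given in the text. The following sketch is a consistency check only, with illustrative variable names; it confirms that the sub-groups add up to the stated totals and that the percentages follow from the counts.

```python
def percent(cases, total):
    """Frequency of a condition as a percentage, to one decimal."""
    return round(100 * cases / total, 1)


# Metopic skulls by state of the metopic and sagittal sutures:
# (both open or closing together, one slightly ahead, sagittal
# closed with metopic open).
male_groups = (19, 6, 1)
female_groups = (21, 1, 0)

male_metopic = sum(male_groups)      # of 341 adult males
female_metopic = sum(female_groups)  # of 267 adult females

print("male:    ", male_metopic, percent(male_metopic, 341))
print("female:  ", female_metopic, percent(female_metopic, 267))
print("immature:", 11, percent(11, 61))
```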

A few other sutural anomalies, besides those treated under supernumerary 
bones below, were noted. Among 267 female skulls, twelve were found with traces 
of the suture between the ex- and supra-occipital bones, but no trace was found 
in any male specimen. Only one example of a complete suture across a malar 
bone was found (No. 154 male), the right malar bone being divided and the 
left normal. No. 216 (male) shows incomplete fusion of the basi-occipital and 
basi-sphenoid, although it is almost certainly adult. No. 491 (female) shows 
the suture between the frontal and sphenoid bones obliterated on both sides, 
although the coronal suture is open. Several examples of fused nasal bones are 
noted in the remarks. 

The region of the pterion and supernumerary bones. The region of the
pterion was examined on all the skulls, and the totals for which the sutures
there are visible are given in the table below. Epipteric bones are recorded in
the remarks on individual specimens, and, as usual, they are found to be
diversified in number and size. The numbers of specimens showing contact between
the temporal and frontal bones are given in the following table, the right and
left sides being considered separately:




                                                       ♂             ♀          Immature
                                                     R     L       R     L       R     L

Total no. of skulls with sutures at pterion
  visible on the side in question*                 257   258     136   138      61    61
No. of skulls with fronto-temporal articulation      2     3       3     3       1     2
No. of skulls with pterion in K                      -     -       1     2       -     -

* For the normal series (all tombs), excluding the posthumously and artificially deformed
skulls, those showing premature closing of the sagittal or coronal suture, and No. 382. Among all
these specimens excluded there is one case of contact between the temporal and frontal bones.


Of the four male skulls with fronto-temporal articulation, one shows the
condition on the right side only, two show it on the left side only, and the last
shows it on both sides. Of the five female skulls, two show the condition on the
right side only, two on the left side only, and for the last it is bilateral. Of the
two juvenile specimens, one has fronto-temporal articulation on the left side
only, and for the other the condition is bilateral. Of the two female specimens
with the pterion in K, one case is on the left side only, and for the other the
condition is bilateral. The low frequency of cases showing contact between the
temporal and frontal bones is comparable with those recorded for European
series. One of the male skulls suspected to be artificially deformed shows
fronto-temporal articulation on both sides, but all the others have normal pteria.
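
The side-by-side counts in the table and the skull-by-skull counts in this paragraph are two views of the same data: a bilateral case is counted once under R and once under L. A minimal sketch of the conversion, with an assumed function name, is:

```python
def side_totals(right_only, left_only, bilateral):
    """Convert per-skull counts into per-side (R, L) counts."""
    return {
        "skulls": right_only + left_only + bilateral,
        "R": right_only + bilateral,
        "L": left_only + bilateral,
    }


# Fronto-temporal articulation, counts as stated in the text:
# (right side only, left side only, bilateral).
male = side_totals(1, 2, 1)
female = side_totals(2, 2, 1)
immature = side_totals(0, 1, 1)
# Pterion in K, female specimens only.
pterion_k_female = side_totals(0, 1, 1)

for name, row in (("male", male), ("female", female),
                  ("immature", immature),
                  ("pterion in K, female", pterion_k_female)):
    print(name, row)
```

The per-side figures obtained in this way can be set against the R and L columns of the table as a check on transcription.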

Examples of one or more wormian bones between the temporal squamae
and parietals were observed with a frequency which appears to be unusually
high. There are ten cases of the condition exhibited unilaterally, and five
bilaterally, among the male skulls; four unilaterally and four bilaterally for
the female; and one unilaterally and one bilaterally among the immature
specimens.

The normal series of 341 male skulls (all tombs) includes the following cases
of complete or partial division of the occipital bone: os épactal (preinterparietal)
2, interparietal bones 8 (only os pentagonale and os triangulare R separate 2;
only os pent. separate 2; only os tri. R separate 2; only os tri. L separate 1; two
large bones meeting below lambda 1, No. 60, Plate XIV b), traces of horizontal
suture of interparietal bones near asteria 11. The normal series of 267
female skulls includes the following cases: os épactal 3, interparietal bones 2
(only os tri. separate 1; only os pent. separate 1), traces of horizontal suture
of interparietal bones 9. The series of 61 immature specimens includes the
following cases: os épactal 1, only os pent. separate 1. All the skulls not included
in the normal series, on account of deformation or for other reasons, have
normal occipital bones. The percentages of the occurrence of true interparietal
bones, of one form or another, are 2.3 for the male adults, and 0.7 for the female
adults. These values are close to those given for the Kerma Egyptian series
(Collett, 1933, p. 265), viz. 3.6 and 0.9%, respectively.

One male and one female skull with an ossicle of bregma were noted in the
total series. Ossicles of lambda of varying sizes were far more frequent. In
general the sutures appeared to be moderately complicated and wormian bones
in the lambdoid suture are by no means rare. A few cases of supernumerary
bones in the sagittal and coronal sutures were noted, but these are all small,
with one exception. This is a male specimen (No. 299), the sagittal suture being
closed and the coronal closing. There is a large supernumerary bone in the right
side of the coronal suture above the pterion. It is roughly triangular in shape
(see Plate XIV a), with its maximum length over [illegible] mm. (the inferior margin
being broken) and its maximum breadth 24 mm. The suture bounding this bone
anteriorly is symmetrically disposed with regard to the left half of the coronal
suture, which is normal, so the additional element appears to occupy part of
the area normally occupied by the right parietal bone.

Other anatomical anomalies. The rarest anatomical anomaly noted in the
whole series is a case of complete absence of the right auricular passage. This
specimen (No. 324) is male and Plate XV a shows the condition of the affected
region. There is no opening in the bone taking the place of the right auditory
meatus. The corresponding region on the left side is normally formed. Apart
from asymmetry in general form (the auricular region of the temporal and
adjoining parts of the base of the skull being more protruding on the left than
on the right) there are no clear differences between the two sides of the base
of the skull. Hrdlička (1932-3) has discussed similar cases of complete congenital
absence of the external auditory meatus and tympanic bone, with reference to
seven American specimens, all of which are affected on the right side only.

An orifice in the right temporal squama of a male skull (No. 206) was noted.
This is the end of a canal leading through the bone to the interior of the brain
cavity. Another male specimen (No. 301) exhibits complete blockage of the left
jugular foramen (see Plate XV b), the right foramen being normal.

Wounds. Small healed wounds were found on several of the skulls, and they
are noted in the remarks given in the appended tables of individual measurements.
One male skull (No. 47) has a healed wound on the left malar bone, but all other
injuries noted are on the cranial vault. The most severe examples of completely
healed injury are on three male skulls, No. 5 (frontal bone, Plate XVIII a), No. 190
(frontal bone) and No. 301 (left parietal), and four female skulls, No. 419 (frontal,
Plate XVIII c), No. 464 (frontal), No. 544 (frontal), and No. 670 (left parietal).
Only one specimen (No. 108, male, Plate III b) shows wounds which
were probably inflicted not long before death. The outer table was completely
removed in a region on the left, and another on the right, parietal: the edges of
the affected area apparently show signs of healing, but the diploë is still exposed.
There are long cracks in the vault of the same skull, but these are probably due
to post-mortem injury.

Diseased and other conditions. I am indebted to Dr L. W. Proger and Dr
A. M. El Batrawi for commenting on the following and some of the exceptional
specimens previously described.

A female cranium (No. 602, Plate XVII) has what appears to be a diseased
area extending over the greater part of the right side of the frontal bone. All
the bones of the vault are softened and in a bad condition, with numerous small
cracks, and the area in question is raised above the general level of the outer
table. The roof of the right orbit is also affected and the endocranial surface of
the right side of the frontal bone is slightly roughened. The inflammatory
condition may possibly be due to osteomyelitis. Another skull, male (No. 1, Plate
XVIII b), has a roughened area on the parietal above the mid-point of the right
side of the lambdoid suture. There seems to be no doubt that this is a
pathological, and not a traumatic, lesion.

A cranium, presumed to be male (No. 382, Plate XVI), is peculiar on account
of the "swollen" appearance of its brain-box. This suggests hydrocephaly, but
the bones are not exceptionally thin. The basi-occipital is unusually short and
broad, so the condition may be due to achondroplasia.

Artificially deformed skulls. There are eight of the Lachish skulls, all from
Tomb 120, which are clearly artificially deformed, or which suggest this
condition. The most marked cases are Nos. 381 (male) and 673 (female), both of
which happen to be metopic, and they are both young adult (see Plate VI).
They are typical examples of fronto-occipital deformation, the two bones being
flattened and a post-coronal depression being apparent in each case. Remarks
on these two, and on the other six specimens, are in the appended tables of
individual measurements. Of the latter, two male specimens (Nos. 376 and 378)
show a clear suggestion of both frontal and occipital flattening, while the former
shows slight post-coronal depression, but the latter does not. Nos. 377, 379,
380 (male) and 461 (female) show some degree of frontal flattening only, but
the occipital bones appear to be perfectly normal; of these No. 379 is the only
one showing any suggestion of a post-coronal depression. No. 375 (male) shows
some degree of both frontal and occipital flattening but no post-coronal
depression. Photographs of all these specimens, except No. 461, are reproduced
in Plate VII. It is not possible to assert definitely that any of them were
intentionally deformed in childhood. No. 380 has the coronal suture nearly
obliterated and the sagittal and lambdoid open, and it is possible that its peculiar
form is due solely to premature closing of the anterior suture. Artificial
deformation was extremely rare in ancient Egypt, if it was ever practised there at all.
There are records of affected specimens of earlier date than the Lachish from
other parts of Western Asia, Crete, Cyprus, and some countries of Eastern
Europe (see Dingwall, 1931).

Trepanned skulls. The three trepanned skulls have been described by Dr
T. Wilson Parry (1936), and new photographs of them are given in Plates IV
and V. He says that they are the first specimens exhibiting evidence of this
surgical operation to have been found in Asia, and no others have been
discovered since in the continent. On two of the skulls (No. 114, an ageing
male, and No. 116, a young adult male) a quadrilateral of bone has been removed
by sawing, and it is said that there is no evidence that the primitive
operation was performed in the same way in any part of the world except Peru.
The third skull (No. 340, a young adult male) shows the results of an operation
of a different type. It is suggested that the individual had a depressed
fracture, and that following this a piece of bone, which had become partly free
as a result of the accident, was separated by sawing and removed. He survived
long enough to enable the edges of the cavity to become completely healed,
while the other two men must have died shortly after the operation. There
appears to be no recorded case of a trepanned skull from ancient Egypt. Dr
Batrawi (1935, Plate XV) has given a photograph of a young adult female
specimen of Meroitic age from Lower Nubia with a circular opening on the right
side of the frontal bone, supposed due to a trepan.

4. Remarks on the condition and anomalies of the jaws of the Lachish skulls

Considerably fewer than half of the skulls in the Lachish series have the
upper dental arch complete or nearly complete. Table III gives statistics
regarding the loss of teeth before death for the adult specimens having complete
upper jaws, and for the smaller number of adult mandibles with the dental arch
complete. The percentages of cases with no teeth lost before death are high, but
it must be remembered that the age constitution, judging from the state of the
sutures, indicates a younger group, on the average, than that expected in a
cemetery population. The frequencies of adult upper jaws with one or both
third molars absent are not unusual, but it is customary to find the female
percentage greater than the male. The samples are too small, however, to give
a reliable sexual comparison in this respect. Remarks on the condition and
anomalies of the upper jaws, and of the few mandibles associated with crania,
are given in the appended tables of individual measurements.


TABLE III

The frequencies of different conditions of the teeth in
adult jaws having complete dental arcades

                                                        Upper jaw            Lower jaw
                                                      Male     Female      Male     Female
(i)   All teeth including third molars present
      at death                                         61        70         13        13
(ii)  Third molars apparently absent and no teeth
      lost before death                                 6         8          1         4
(iii) One third molar apparently absent and no
      teeth lost before death                           5         0          0         2
(iv)  Third molars erupted, or believed erupted,
      and one or more teeth lost before death          55        29         12        12
(v)   Third molars apparently absent, and one or
      more teeth lost before death                      4         2          3         0
(vi)  One third molar only apparently absent, and
      one or more teeth lost before death               1         1          1         1

Total no. of complete arches                          132       110         30        32

Total no. with no teeth lost before death:             72        78         14        19
      (i), (ii) and (iii)                           (54.5%)   (70.9%)    (46.7%)   (59.4%)

Total no. with one or both third molars believed       16        11          5         7
      unerupted: (ii), (iii), (v) and (vi)          (12.1%)   (10.0%)    (16.7%)   (21.9%)
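
The totals and percentages in Table III follow directly from the six row counts, so they can be re-derived as a check on the transcription. The sketch below is only an illustration in Python: the counts are those read from the table (with the two illegible entries in rows (i) and (iv) of the male upper-jaw column inferred from the column total of 132), and the names `COUNTS`, `column_sum` and `summarise` are assumptions made for this example, not anything used in the paper.

```python
# Counts as read from Table III, in the column order:
# upper jaw male, upper jaw female, lower jaw male, lower jaw female.
# The names below are illustrative and do not come from the paper.
COLUMNS = ("upper male", "upper female", "lower male", "lower female")
COUNTS = {
    "i": (61, 70, 13, 13),
    "ii": (6, 8, 1, 4),
    "iii": (5, 0, 0, 2),
    "iv": (55, 29, 12, 12),
    "v": (4, 2, 3, 0),
    "vi": (1, 1, 1, 1),
}


def column_sum(rows, col):
    """Add the counts of the given rows for one column."""
    return sum(COUNTS[row][col] for row in rows)


def summarise(col):
    """Return the totals and percentages for one column of the table."""
    total = column_sum(COUNTS, col)
    no_loss = column_sum(("i", "ii", "iii"), col)
    m3_absent = column_sum(("ii", "iii", "v", "vi"), col)
    return {
        "total": total,
        "no_loss": no_loss,
        "no_loss_pct": round(100 * no_loss / total, 1),
        "m3_absent": m3_absent,
        "m3_absent_pct": round(100 * m3_absent / total, 1),
    }


if __name__ == "__main__":
    for col, name in enumerate(COLUMNS):
        print(name, summarise(col))
```

Run on these counts, the sketch returns the printed totals (132, 110, 30, 32) and the printed percentages of jaws with no teeth lost before death.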
















118 Cranial and other Human Remains from Palestine 

Any dental anomalies of special interest were examined by Mr C. Bowdler
Henry, M.R.C.S., and he has kindly allowed me to incorporate his notes in the
following descriptions. The skiagrams reproduced in Plates XIX, XXII and
XXIII were provided by him.

Deflected canines. There are four adult upper jaws with a canine on one
side grossly deflected and not completely erupted.

No. 111, male. The unerupted right canine is misplaced and buried obliquely,
so that the tip of the crown is situated in the palatal alveolus, between the
sockets of the first and second incisors, and the apex of the root is seen
(uncovered by bone) in the external surface of the maxilla over the apex of the
socket of the first premolar. The third molars were congenitally absent, and
all the other teeth were present and in good condition.

No. 142, male. The unerupted right canine is placed with the tip of its crown
close to the anterior palatine fossa, and its root directed superiorly and
posteriorly on the palatal side of the roots of the two premolars. It is
possible that the left third molar was either abnormally small or else
congenitally absent. The region of the right third molar is defective.

No. 401, female: Plate XXI D. The anterior part of the dental arch is damaged,
and it is not possible to judge how many teeth were present. The unerupted
left canine is placed with the tip of its crown in the anterior palatine fossa,
and its root directed externally, posteriorly, and superiorly, towards the malar
process of the maxilla, and placed above the apices of the premolars. The right
third molar is normal, but it is probable that the left third molar was reduced
in size.

No. 659, female. The left side of the palate and the part of the right side
anterior to the canine are missing. The unerupted canine is placed with the tip
of its crown near the normal position of the alveolus of the lateral incisor,
and its root directed externally, posteriorly, and superiorly towards the malar
process of the maxilla. The right third molar was probably erupted and lost
before death.

Supernumerary denticles. There are two crania and one mandible exhibiting 
this anomaly. 

No. 706, immature. The dental arch is symmetrical, with the left third molar
erupting and the right apparently congenitally absent. There is a supernumerary
denticle buried in the palate behind the right central incisor (see Plate XXI C).
It is lying obliquely, with its crown in the anterior palatine fossa, and its
tip impinging against the socket of the left central incisor. The condition of
the residual root of the second right incisor suggests that this tooth had been
broken during life.

No. 383, female. All the teeth appear to have been present and in good
condition at death. There is a small supernumerary denticle placed unusually
far back in the middle of the palate, to the left side of the suture (see Plate
XXII C, D). The skiagram shows that the apex of the crown is directed
posteriorly.

No. 437, female. In the mandible all teeth were present at death, and there
are five dental rudiments embedded in the outer alveolar margin. The positions
of these can be seen from Plate XX. In the upper jaw all teeth were present at
death, except the left third molar which had been lost. There are diastemae
between the central incisors and between the lateral incisors and canines on
both sides. There was post-normal occlusion of the jaws, and the teeth were
markedly and irregularly worn.

Diastemae. There is one cranium showing diastemae between teeth, in 
addition to No. 437 described above. 

No. 132, male. The third molars appear to have been congenitally absent,
and all other teeth were present at death in the upper jaw. There is a diastema
between the canine and first premolar on both sides (see Plate XXI B).

Misplaced and missing teeth, and retained milk teeth. In addition to the four 
examples of deflected canines described above, there are four crania and two 
mandibles falling in this category. 

No. 72, male. The dental arch appears to be sufficiently roomy for the normal
dentition, but crowding and some irregularity are present, due to the abnormal
persistence of the right milk canine, and to the rotation of the left second
premolar (see Plate XXIII C, D). The right third molar appears to have been
congenitally absent, and the left is unerupted with the occlusal surface of its
crown facing posteriorly and slightly laterally in the posterior wall of the
antrum, and in the developmental position of the tooth. The deciduous canine is
a well-developed tooth retained in its functional position. Limitation of space
has caused the permanent right canine to be slightly rotated. The left second
premolar is rotated so that its external, or buccal, cusp is antero-internal,
and its internal, or palatal, cusp is postero-external.

No. 445, female. No teeth had been lost before death, but there are only
sockets for three in the right side anterior to the second premolar (see Plate
XXI A). It is probable that the lateral incisor was missing. The antero-inferior
part of the inter-maxillary region is deflected to the right, and the anterior
palatine foramen appears to be completely absent.

No. 49(1, female. The upper left third molar is below the alveolar margin
and apparently impacted (see Plate XXII E).

No. 500, female. The dental arch is well formed, and all teeth except three
molars are normally erupted. The left third molar was erupting at the time of
death. The second and third right molars exhibit an unusual form of delayed
eruption (see Plate XXII A, E). Neither tooth had emerged from the gum,
although their direction and root formation appear to be normal, judging from
the skiagram. The second molar is not impacted against the first, which might
have accounted for the delayed eruption, but in fact they are clearly separated.

No. 1039, female. The left third molar of the mandible appears to be
slightly "over-erupted", and it has a backward tilt. In the region of the second
and third right molars, which are missing, there is a deep pathological
excavation, suggestive of an abscess cavity or cyst. The place of the second
premolar on the left side, which was congenitally absent, is taken by the
retained deciduous second molar. The third molar on the left has an abnormally
deep pit where the limbs of the cruciform fissures intersect; the aperture is
1 mm. in diameter. There are two smaller cavities in the occlusal surface of the
second molar on the same side.

No. 1068, immature. This mandible had lost no teeth before death. The
canine and first premolar on the right are unerupted and completely buried,
the premolar lying obliquely towards the canine so that their crowns are in
contact (see Plate XXIII A, B). A skiagram shows that the same teeth, unerupted
on the other side, are in a similar condition.

Diseased conditions of the jaws. Only the more marked cases of diseases of
the jaws were noted. Notes on a mandible (No. 1039), with a large abscess
cavity or cyst, are given above, and there are three skulls and two mandibles
exhibiting similar conditions.

No. 14, male. All upper teeth were present at death except the lateral
incisor and canine on the right side. The shrunken and obliterated condition of
the alveolus suggests that these two teeth might have been lost before death
through traumatic injury.

No. 467, female. Nearly all teeth had been lost before death, and there is
a large cyst in the anterior part of the palate, penetrating to the nasal
aperture (see Plate XXIII F).

No. 469, female. All the upper molars had been lost before death, and there
is a large abscess cavity or cyst in the molar region on the right side (see
Plate XXIII E).

No. 1017, male. No teeth had been lost from the right side of this mandible
before death, the left side being defective. There is a large carious cavity in
the posterior half of the first molar, which evidently involved the dental pulp.
In consequence of this, an abscess formed which discharged through a sinus in
the external alveolar plate, over the apex of the posterior root of the tooth.

No. 1061, female. This mandible had lost no teeth before death, the third
molars being congenitally absent. The bone of the external surface immediately
above the left mental foramen is diseased.

Apparent adventitious filling of a tooth. A female cranium (No. 618) is
remarkable on account of the fact that a piece of metal was found firmly
embedded in, and level with, the occlusal surface of the second right molar near
its centre. The fact that the surface of the metal that could be seen was flat,
and that its edges conformed to the surface of the tooth, shows conclusively
that it must have been where it was found during the life of the individual.
Its appearance
was precisely similar to that of an artificial stopping. Plate XIX reproduces
photographs of the whole palate and of the right molar region, before and after 
the removal of the metal, and a skiagram showing its depth. The filling was 
removed and its dimensions were found to be 1 x 3 mm. (exposed surface) by 
1 mm. (height). The cavity in the tooth, which shows no sign of disease, is almost 
circular in form, and its position, where the fissures between the cusps join, is 
one in which a small pit is sometimes found. The description of mandible 
No. 1039, on p. 120 above, refers to pits of this kind in the occlusal surfaces of 
a second and third molar. There is a small pit in the surface of the second right 
molar of skull No. 618. These circumstances suggest that the person accidentally 
bit the piece of metal, which became lodged in the natural pit of the second 
molar and worn down as teeth are normally. Such an explanation appears 
much more plausible than an alternative one which might attribute the filling 
to surgical interference. 

5. The statistical nature of the material

The statistical nature of the Lachish series is discussed in this section,
topics considered being the question whether the male and female adult and
immature samples can be supposed to represent the same population or not,
sexual comparisons of average types and variabilities, and allied matters. As
the series came from four tombs, which were adjoining, it is advisable to ask
first whether there are any significant differences between the series from
each, and whether it is legitimate to pool all the material for statistical
purposes. In making all metrical comparisons, the specimens patently, or
apparently, artificially deformed, those showing premature closing of the
sagittal or coronal suture, and the one of unusual (? hydrocephalic) form were
excluded. The numbers in these exceptional groups are in the table below, and
measurements of the specimens, which are all adult, are given separately in the
appended tables of individual measurements.


The numbers of anomalous crania


Tomb 



9 

120 

Artificially deformed

6 

1 


Premature closing of sagittal suture

11 

6 


Premature closing of coronal suture 

1 



Hydrocephalic? 

1 



When seen all together, the series from each tomb showed a variety of
"types", but no specimens which clearly stood out from the others on account
of their characters were noted, with the exception of two males, viz. No. 166,
Tomb 120, and No. 179, Tomb 116. Norma facialis views of these two are
given in Plate III C, D, and it will be seen that they are of a very similar
type, which is distinctly different from that of the male specimen (Plate X,
right), which was selected because none of its measurements are at all peculiar
(see p. 163 below). The cephalic index of No. 15(1 (81.7) is high, but not
extreme, and that of No. 179 (78.4) is less exceptional. In spite of these two,
it was thought best not to exclude any specimens from the series on account of
their appearance.

In making male comparisons between tombs, it was necessary to pool data
from Tombs 108 and 116, as the series from them are too small to stand alone.
Male and female means are given in Table IV for various series from single
tombs, or groups of tombs, in the case of those coefficient of racial likeness
characters for which the pooled means for Tombs 107, 108 and 116 are based on
more than ten skulls. It will be seen that the series from Tomb 120 is the only
one long enough to provide adequate statistical constants. The others are just
long

TABLE IV

Mean measurements of series of Lachish crania from
different tombs

                              Male series from tombs                          Female series from tombs
Character       120            107           108, 116      107, 108, 116      120            107, 108, 116

L               184.3 (266)    186.0 (32)    186.0 (24)    185.2 (56)         176.6 (?)      177.6 (69)
B               137.1 (276)    136.2 (28)    134.6 (23)    136.8 (51)         133.1 (?)      133.8 (67)
B'               96.4 (263)     96.6 (32)     96.1 (24)     96.9 (56)          92.2 (?)       92.1 (66)
H'              133.8 (226)    133.2 (24)    134.3 (18)    133.7 (42)         128.3 (166)    128.6 (47)
S               376.9 (221)    372.7 (20)    372.6 (14)    372.6 (34)         362.9 (104)    363.5 (48)
βQ'             308.7 (265)    307.7 (29)    308.3 (22)    308.0 (51)         297.0 (183)    298.8 (51)
U               518.1 (266)    519.0 (27)    517.1 (22)    518.2 (49)         500.1 (170)    501.4 (44)
fml              37.1 (211)     36.4 (21)     38.0 (15)     37.1 (36)          35.8 (161)     35.9 (42)
fmb              30.6 (213)     29.7 (20)     30.4 (11)     30.0 (31)          29.0 (164)     28.8 (38)
LB              100.6 (205)    101.3 (21)    101.1 (17)    101.2 (38)          96.6 (169)     96.7 (47)
G'H              70.1 (76)      68.8 (14)     72.6 (8)      70.2 (22)          67.2 (69)      6?.4 (28)
NB               25.2 (96)      25.0 (18)     24.9 (9)      25.0 (27)          24.6 (62)      24.3 (26)
NH, L            51.3 (103)     51.7 (20)     52.1 (13)     51.8 (33)          49.1 (78)      48.0 (38)
O1               41.4 (117)     41.8 (21)     42.0 (10)     41.9 (31)          40.0 (78)      40.4 (33)
O2, L            32.8 (120)     33.6 (19)     33.3 (13)     33.6 (32)          33.1 (78)      33.4 (34)
G1               47.0 (72)      46.3 (16)     46.4 (12)     46.7 (28)          46.0 (?3)      44.4 (27)
G2               40.4 (43)      39.8 (8)      39.6 (6)      39.7 (14)          39.2 (48)      39.4 (2?)
100 B/L          74.6 (269)     73.6 (28)     72.6 (23)     73.1 (51)          76.6 (190)     75.4 (60)
100 H'/L         72.7 (216)     72.6 (24)     72.0 (18)     72.3 (42)          72.7 (162)     72.8 (4?)
100 B/H'        102.6 (219)    103.2 (20)    100.8 (17)    102.1 (37)         103.7 (162)    104.3 (44)
Oc.I.            59.6 (238)     60.0 (26)     59.4 (16)     59.8 (40)          60.1 (160)     59.7 (49)
100 fmb/fml      82.9 (196)     81.8 (16)     79.7 (11)     80.9 (27)          81.5 (134)     80.6 (33)
100 NB/NH        49.6 (88)      48.7 (17)     47.9 (9)      48.4 (28)          50.3 (59)      49.7 (26)
100 O2/O1, L     79.3 (113)     80.4 (18)     78.3 (10)     79.7 (28)          81.7 (89)      82.8 (31)
100 G2/G1        85.7 (33)      86.9 (6)      86.0 (6)      86.5 (11)          87.0 (39)      80.6 (10)
N∠               64.0° (70)     64.9° (12)    62.2° (7)     63.0° (19)         64.4° (54)     64.7° (21)
A∠               73.8° (70)     74.0° (12)    74.6° (7)     74.2° (19)         73.9° (54)     73.3° (21)

enough, however, to make comparison of interest. The following crude
coefficients of racial likeness,* using standard deviations for the Tomb 120
series alone, were found:

Male, Tomb 120 (177.7†) and Tomb 107 (21.4) ... -0.45 ± 0.19 (25‡).
Male, Tomb 120 (202.1) and Tombs 108, 116 (16.6) ... 0.44 ± 0.21 (20).
Male, Tomb 107 (23.9) and Tombs 108, 116 (16.6) ... -0.18 ± 0.21 (20).
Male, Tomb 120 (167.3) and Tombs 107, 108, 116 (34.5) ... 0.41 ± 0.18 (27).
Female, Tomb 120 (116.3) and Tombs 107, 108, 116 (36.8) ... -0.20 ± 0.18 (29).

It is known from experience that no reliance can be based on comparisons
of measurements for short series of skulls. Two series, each being made up by
fewer than twenty specimens, and actually representing two distinct races, are
quite likely to indicate an insignificant difference, and series of fifty or
more are usually required before any reliance can be placed on comparisons such
as those given by coefficients of racial likeness. All of the five values above
may be considered to differ insignificantly from zero, so the evidence, as far
as it goes, indicates that the series from all the tombs may have been random
samples from the same population. Larger samples would be required to justify
this hypothesis in an adequate way, but for practical purposes there can be no
objection to pooling all the material, on the supposition that it represents a
single population. This conclusion is in conformity with the archaeological
evidence.

It appeared worth while to compare the variabilities of two sub-groups of
the total material. All the standard deviations given in Table V are for forty
or more skulls. Two of the differences for corresponding constants might be
considered just significant if considered by themselves (male L, Δ/P.E. Δ = 3.4,
and male S2, 3.2, the Tomb 120 constant being the greater in both these cases),
but the conclusion for all characters must be that the two groups are almost
identical in variability.


* With the usual notation, the form of the crude coefficient used is:

$$\frac{1}{M}\sum\left\{\frac{n_s n_{s'}}{n_s+n_{s'}}\left(\frac{m_s-m_{s'}}{\sigma_s}\right)^2\right\}-1\pm 0.67449\sqrt{\frac{2}{M}}=\frac{1}{M}\Sigma(\alpha)-1\pm 0.67449\sqrt{\frac{2}{M}}.$$

If $\bar{n}_s$ is the mean number of skulls available for the characters used in
the case of the first series, and $\bar{n}_{s'}$ is the same for the second
series, then the "reduced" coefficient is defined to be:

$$50\times\frac{\bar{n}_s+\bar{n}_{s'}}{\bar{n}_s\,\bar{n}_{s'}}\left\{\frac{1}{M}\Sigma(\alpha)-1\right\}\pm 0.67449\times 50\times\frac{\bar{n}_s+\bar{n}_{s'}}{\bar{n}_s\,\bar{n}_{s'}}\sqrt{\frac{2}{M}}.$$

† This figure is the average number of skulls for the series ($\bar{n}$), in the
case of the comparison in question, no means based on fewer than ten specimens
being used.

‡ The number in brackets following a coefficient is the number of characters on
which it is based.
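
The two coefficients defined in the footnote can be sketched in Python as below. This is a minimal illustration under stated assumptions rather than the author's procedure: "the usual notation" is taken to mean character means, the standard deviations of the standard series, and the numbers of skulls per character, and the names `crude_crl` and `reduced_crl` and their arguments are invented for the example.

```python
import math

PE_FACTOR = 0.67449  # probable error = 0.67449 x standard error


def crude_crl(means_a, means_b, sds, n_a, n_b):
    """Crude coefficient of racial likeness and its probable error.

    Every argument is a sequence holding one value per character:
    the two series' means, the standard deviation taken as the scale
    (assumed here to come from a long standard series), and the two
    series' numbers of skulls.
    """
    m = len(means_a)
    alphas = [
        (na * nb) / (na + nb) * ((ma - mb) / sd) ** 2
        for ma, mb, sd, na, nb in zip(means_a, means_b, sds, n_a, n_b)
    ]
    coefficient = sum(alphas) / m - 1.0
    probable_error = PE_FACTOR * math.sqrt(2.0 / m)
    return coefficient, probable_error


def reduced_crl(coefficient, probable_error, n_a, n_b):
    """Rescale a crude coefficient to the 'reduced' form of the footnote."""
    nbar_a = sum(n_a) / len(n_a)  # mean number of skulls, first series
    nbar_b = sum(n_b) / len(n_b)  # mean number of skulls, second series
    factor = 50.0 * (nbar_a + nbar_b) / (nbar_a * nbar_b)
    return coefficient * factor, probable_error * factor
```

The probable error depends only on the number of characters M, which is why the coefficients quoted in the text carry ±0.19 for 25 characters, ±0.21 for 20 and ±0.18 for 27.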
TABLE V

Standard deviations for series from Tombs 107, 108
and 116 combined, and from Tomb 120

                          Male                                    Female
Character     Tomb 120          Tombs 107, 108, 116     Tomb 120          Tombs 107, 108, 116

L              6.03 ± 0.18*      4.82 ± 0.31             5.03 ± 0.17       5.1? ± 0.32
B              4.96 ± 0.1?       5.05 ± 0.38             ?.0? ± 0.1?       3.95 ± 0.25
B'             4.16 ± 0.12       4.19 ± 0.27             4.53 ± 0.10       3.89 ± 0.25
H'             4.85 ± 0.15       5.73 ± 0.42             4.18 ± 0.15       4.32 ± 0.30
S1             5.97 ± 0.18       6.84 ± 0.40             5.87 ± 0.20       5.13 ± 0.32
S2             7.44 ± 0.21       5.96 ± 0.41             7.18 ± 0.25       7.02 ± 0.47
S3             7.04 ± 0.22       5.89 ± 0.44             6.50 ± 0.24       6.84 ± 0.40
βQ'            9.64 ± 0.29       9.78 ± 0.65             9.47 ± 0.33       8.54 ± 0.57
U             13.86 ± 0.41      11.65 ± 0.79            12.54 ± 0.40      12.52 ± ?
100 B/L        3.02 ± 0.09       3.29 ± 0.22             3.01 ± 0.10       2.?? ± 0.18
100 H'/L       2.88 ± 0.09       3.18 ± 0.24             3.02 ± 0.10       2.51 ± 0.17
Oc.I.          2.62 ± 0.09       2.30 ± 0.18             2.51 ± 0.0?       2.30 ± 0.10

* The symbol ± indicates probable errors throughout this paper. They were used
instead of standard errors as probable errors have been given far more
frequently than standard errors in craniometric studies.
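
The significance ratios quoted in the text (the difference between two constants divided by the probable error of that difference) can be reproduced from the ± values of Table V. The Python sketch below is illustrative only. It assumes the usual large-sample expression for the probable error of a standard deviation, 0.67449 σ/√(2n), which the paper does not state, and the function names are invented for the example.

```python
import math

PE_FACTOR = 0.67449  # probable error = 0.67449 x standard error


def probable_error_of_sd(sd, n):
    """Large-sample probable error of a standard deviation (assumed formula)."""
    return PE_FACTOR * sd / math.sqrt(2.0 * n)


def difference_ratio(a, pe_a, b, pe_b):
    """Difference of two constants over the probable error of the difference."""
    return abs(a - b) / math.sqrt(pe_a ** 2 + pe_b ** 2)


if __name__ == "__main__":
    # Male L, Tomb 120 against Tombs 107, 108 and 116, as read from Table V.
    print(round(difference_ratio(6.03, 0.18, 4.82, 0.31), 1))
    # Male S2, same two groups.
    print(round(difference_ratio(7.44, 0.21, 5.96, 0.41), 1))
    # Probable error of the pooled male L value, assuming 32 + 24 = 56 skulls.
    print(round(probable_error_of_sd(4.82, 56), 2))
```

A ratio above 3 is the threshold the text treats as "just significant"; with probable errors this corresponds to roughly two standard errors.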


The data for material from all four tombs were pooled, so that a single
Lachish series, subdivided according to sex and age, is considered in all
comparisons made below. It may be asked, next, whether the total adult male and
female series can be supposed to represent the same population or not. The
constants for them are given in Table VI.

The corresponding mean measurements of shape (indices and angles) for
the two sexes are obviously very close. The differences only exceed three times
their probable errors in the case of:

100 B/L (Δ/P.E. Δ = 6.7), 100 B/H' (4.6), 100 (B-H')/L (AU), Oc.I. (T3),
100 O2/O1, L (6.6), 100 SSISQ (5.7), 100 fmb/fml (3.4), lOCl SjC, ml {0-3).

In the case of the first six of these characters, it is normally found that the
male and female means for long series show small differences of the same signs
as those observed for the Lachish series. For the last character there is no
good comparative material, but the longest series available, described by Woo
(1937), shows a sex difference of the same sign. As far as can be told from
these comparisons, the male and female Lachish series represent precisely the
same population.



TABLE VI

Constants for the male and female series of adult crania
from Lachish (all tombs combined)

* Determined from reconstruction formulae; see Appendix 1.

















The few juvenile and more numerous adolescent skulls together, a total of
sixty-one unsexed specimens, give the mean indices in the following table,
where comparison is made with the male and female adult means.


                    Male                   Female                 Immature

100 B/L             74.3 ± 0.12 (310)      75.5 ± 0.13 (252)      76.2 (53)
100 H'/L            72.7 ± 0.12 (207)      72.7 ± 0.14 (200)      72.? (41)
100 B/H'           102.4 ± 0.20 (250)     103.8 ± 0.24 (200)     104.? (30)
100 (B-H')/L         1.7 ± 0.14 (240)       2.8 ± 0.17 (202)       2.7 (38)
100 fmb/fml         82.7 ± 0.26 (222)      81.4 ± 0.28 (157)      7?.5 (35)
100 NB/NH           49.4 ± 0.25 (114)      50.2 ± 0.28 (84)       52.? (22)
100 O2/O1, L        79.4 ± 0.28 (141)      82.0 ± 0.3? (100)      83.8 (22)


The sequence, male adult, female adult, immature (unsexed) mean, is
expected in the case of the last three of these indices. The first four give
means for the immature series which differ quite insignificantly from the female
adult means, and a close correspondence of this kind is expected. There is thus
every reason to believe that the children and adults belonged to the same
population.

The sex ratios (male mean/female mean) for a few measurements of size are
given in Table VII for the Lachish, three Egyptian, and four English series.
All the means involved are based on eighty-six or more crania.


TABLE VII

Sex ratios for the Lachish and other series*

          Lachish   Egyptian,  Egyptian,  Egyptian,  English,       English,     English,      English,
                    E          Kerma      Denderah   Farringdon St  Whitechapel  Spitalfields  Hythe

B         1.027     1.026      1.029      1.031      1.049          1.045        ?             ?
S         1.034     1.034      1.039      -          1.046          1.039        1.051         ?
U         1.036     1.038      1.044      ?          1.044          1.041        ?             1.03?
B'        1.036     1.027      1.040      ?          1.038          1.053        1.014         ?
H'        1.042     1.038†     1.044      1.041      1.05?          1.059        1.053         1.054
L         1.043     1.047      1.046      1.044      ?              1.048        1.052         1.038
LB        1.046     1.063      1.064      1.047      1.046          ?            1.058         1.053

* The series are: Egyptian E, Gizeh, 26th-30th dynasties (Davin & Pearson, 1924);
Kerma, 12th-13th dynasties (Collett, 1933); Denderah, 6th-12th dynasties (Thomson
& MacIver, 1905); Farringdon St, seventeenth century (Hooke, 1926); Whitechapel,
seventeenth century (Macdonell, 1904); Spitalfields, Roman or medieval (Morant &
Hoadley, 1931); Hythe, medieval (Stoessiger & Morant, 1932).

† For H in place of H'. These very similar calvarial heights give almost
identical sex ratios in the case of long series for which both are given.


The characters are arranged in the table in order of the sex ratios for the
Lachish series. The three Egyptian series give very similar orders, that for the
Kerma being particularly like the Lachish. As is usually found, the
preponderance of the average male over the average female skull is clearly
different for different measurements, the sex ratios being rather less for
cranial diameters than for stature and the lengths of the long bones. It is
curious that the orders in which the characters are arranged by the Lachish and
Egyptian sex ratios are distinctly different from all those given by the
constants for the four English series, while these last show a general, though
not very close, correspondence inter se. There appears to be a distinct
difference between the sex differences of the Lachish and ancient Egyptian
races, on the one hand, and the later English ones, on the other.
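
The comparison of orders described above amounts to ranking the characters by sex ratio within each series and seeing how far the ranks agree. A minimal Python sketch follows. It uses only the Lachish and Egyptian E columns as read from Table VII (some digits in the scan are uncertain), and the helper names `sex_ratio`, `order_by_ratio` and `rank_shift` are assumptions made for this illustration.

```python
# Sex ratios (male mean / female mean) as read from Table VII.
SEX_RATIOS = {
    "Lachish": {"B": 1.027, "S": 1.034, "U": 1.036, "B'": 1.036,
                "H'": 1.042, "L": 1.043, "LB": 1.046},
    "Egyptian E": {"B": 1.026, "S": 1.034, "U": 1.038, "B'": 1.027,
                   "H'": 1.038, "L": 1.047, "LB": 1.063},
}


def sex_ratio(male_mean, female_mean):
    """Ratio of the male mean to the female mean for one measurement."""
    return male_mean / female_mean


def order_by_ratio(series):
    """Characters sorted by increasing sex ratio (ties keep table order)."""
    return sorted(series, key=series.get)


def rank_shift(series_a, series_b):
    """Change in rank of each character from series_a to series_b."""
    rank_a = {c: i for i, c in enumerate(order_by_ratio(series_a))}
    rank_b = {c: i for i, c in enumerate(order_by_ratio(series_b))}
    return {c: rank_b[c] - rank_a[c] for c in rank_a}


if __name__ == "__main__":
    print(order_by_ratio(SEX_RATIOS["Lachish"]))
    print(rank_shift(SEX_RATIOS["Lachish"], SEX_RATIOS["Egyptian E"]))
```

On these two columns only B' changes place by more than one rank, which is in line with the statement that the Egyptian orders are very similar to the Lachish one.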

Measures of variability for the total male and female adult series of crania
from Lachish are given in Table VI. Comparison of the relative degrees of
variation exhibited by the material for the two sexes may be considered first.
For the thirty-three absolute measurements the male standard deviation exceeds
the corresponding female value in twenty-four cases, and for the remaining
nine the position is reversed. In one or two instances the excess of the male
over the female constant is clearly significant, but where the female standard
deviation exceeds the male the differences are quite insignificant. In comparing
parallel samples representing the same population it is customary to find a
slight preponderance of male over female variation for size characters. A closer
approach to equality is normally found for these if coefficients of variation
are used.
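
The coefficient of variation is the standard deviation expressed as a percentage of the mean, which removes the effect of the larger average size of the male skull. The short Python sketch below illustrates this with the Tomb 120 values for L as read from Tables IV and V (male mean 184.3, standard deviation 6.03; female mean 176.6, standard deviation 5.03); the function name is an assumption, and the Tomb 120 figures stand in for the pooled constants of Table VI, which are not reproduced here.

```python
def coefficient_of_variation(sd, mean):
    """Standard deviation expressed as a percentage of the mean."""
    return 100.0 * sd / mean


if __name__ == "__main__":
    male_cv = coefficient_of_variation(6.03, 184.3)
    female_cv = coefficient_of_variation(5.03, 176.6)
    # Ratio of male to female variability on each scale.
    print(round(6.03 / 5.03, 3), round(male_cv / female_cv, 3))
```

For this character the male excess in variability is smaller when measured by coefficients of variation than by standard deviations, which is the direction of the effect described in the text.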

The following comparisons are based on coefficients of variation for absolute
(size) characters, and standard deviations for measurements of shape (indices
and angles), the total number of characters being fifty. In twenty-four cases
the male measure of variation exceeds the female, and for the other twenty-six
the position is reversed. The only difference which exceeds three times its
probable error is for LB (Δ/P.E. Δ = 3.2), but no importance whatever can be
attached to a divergence of this order in fifty comparisons. The male and female
series thus show a remarkably close approach to equality in variation, and the
agreement in this respect again suggests that a single population is
represented.

It may be asked next how the variability of this population compares with
that of others which are usually accepted as being racially homogeneous. It will
be sufficient to make comparisons with the long series of 26th-30th dynasty
skulls from Gizeh (Davin & Pearson, 1924), as these have often been used for
the purpose. There are thirty-three characters for which the constants required
(coefficients of variation for absolute measurements and standard deviations for
indices and angles) are available. In the case of the male series the Lachish
value exceeds the Egyptian E in fifteen cases, and the Egyptian E exceeds the
Lachish in the remaining eighteen. Differences exceeding three times their
probable errors are only found in the case of three measurements of shape, viz.
100 B/L (Δ/P.E. Δ = 4.0, Lachish σ the greater), A∠ (4.1, Egyptian E σ greater),
and Oc.I. (8.3, Egyptian E σ greater). In the case of the female series the
Lachish constant exceeds the Egyptian E in twenty-three cases, and the Egyptian
E exceeds the Lachish in the remaining ten. The "significant" differences are
for BL (3.1), 5' (3.8), 100 B/L (4.2) and ,L£t (4.6), the Lachish variability
being the greater for these four, and Oc.I. (7.2), in which case the Egyptian
E σ exceeds the Lachish. It is clear that there is a close agreement between the
variabilities of the two populations, and, in fact, it is not possible to say
that one was more homogeneous than the other. No importance can be attached to
any of the differences for single characters, except those for the occipital
index, which is decidedly more variable for the Egyptian series in the case of
both sexes.

Comparison with data for other series shows that the Lachish standard
deviations of the occipital index are quite unexceptional, while those for the
Egyptian E series are peculiarly large, in view of the fact that all other
characters for it indicate a low order of variability. It may be noted that a
few of the Lachish crania (excluded from the series) are clearly artificially
deformed. If artificial deformation had been generally practised to a slight
degree, this would have been expected to increase the variability of the
occipital index. Such an effect is not observed, however, and it is very
probable that the exceptional cases noted are the only ones exhibiting
artificial deformation.

All the statistical evidence thus points to the fact that the male and female
adult and immature Lachish series represent random samples from a single
population, which showed the same order of variability as ancient Egyptian
populations. It is known that these were slightly less variable than Western
European populations in later times.


6. The Relationships of the Lachish and Ancient Egyptian Populations
Judged from Comparisons of Mean Measurements

The Lachish is the first series of skulls of any period from Palestine which is
of an adequate length, and for which adequate measurements are available.
A rough comparison of mean measurements for it showed that the type is very
similar to that of certain ancient Egyptian series, and this was fully
substantiated by statistical comparisons. As far as can be judged from the appearance
of the skulls, no clear distinction from dynastic Egyptian material is evident.
Comparisons with Egyptian series only are dealt with in this section, and
reference is made to other material from Palestine in § 9.

Far more skulls have been preserved and studied from an anthropological
point of view from Egypt than from any other country in the world. The vast
majority of these relate to Roman and earlier times, and no comparisons with
the meagre post-Roman material from the country were made. Most of the
longer series of male Egyptian skulls representing populations of the period
considered have been compared statistically by Morant (1926). A selection of
the material with which he dealt was made, and certain series were excluded
for various reasons, viz.:

(i) Fouquet's predynastic (the so-called Aeneolithic) series from Middle
Egypt. The male means of size characters for this series are decidedly larger
than those for any other predynastic series dealt with in Morant's paper, or
described since, while its mean indices are very close to those for some
contemporary series. It appears to be very probable that the distinction was due,
either to a process of selection which favoured the larger specimens, or to 
inaccurate sexing. While the series is suspect it appears better not to use it for 
comparative purposes. 

(ii) The series measured by Broca and Chantre, for which the means are 
unsatisfactory as they are not recorded in fine enough units. 

(iii) Four other series, each comprising fewer than thirty skulls.

(iv) The El Kubanieh South series, measured by Toldt, which relates to a
period covering Early and Middle Dynastic times, while all the other series 
relate to shorter intervals. 

The modern Abyssinian series measured by Sergi, and reduced by Morant,
is included in our list. This accounts for eighteen of the series used. The three
other ancient Egyptian series included are the 9th dynasty from Sedment
(Woo, 1930); the Kerma (Collett, 1933), from a Nubian site, although the
population was unquestionably of Egyptian type; and the pooled early
predynastic Badari series (Morant, 1935), measured by Stoessiger (1927) and
Derry. Including the Lachish, there is thus a total of twenty-two male series,
made up by about 3000 skulls. The localities from which the collections were
obtained are shown in the sketch-map, Fig. 2.

Reduced coefficients of racial likeness between all pairs of the twenty-two
series are given in Table VIII. Many of these were obtained from the crude
coefficients given by Morant (1926), and others were copied from tables given
by Woo (1930), and Collett (1933). In addition to these I was able to use a
number of unpublished values calculated by Morant, and to complete the
table I calculated all the coefficients with the Lachish series and several others
which were wanting. The n̄ for a particular series in the table gives the average
number of skulls on which the means are based, in the case of all the characters
used in computing the coefficients which are available for the series. The n̄'s
range from 29.9 to 865.4, and there are only six, including the Lachish, greater
than 100. Experience has suggested that a series of fifty complete skulls of one
sex, or a larger number of incomplete specimens providing an equivalent
amount of evidence, is required in order to give reliable racial comparisons with
a particular population. Of the twenty-two series, there are eight which fail to
satisfy this ideal requirement, but these may be supposed large enough to give
fairly reliable comparisons.

Fig. 2. Sketch-map of Egypt and neighbouring countries showing localities
from which cranial series were obtained.
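The dependence of reliability on the n̄'s can be illustrated numerically. The sketch below assumes the usual biometric definitions, which this section does not restate: a probable error of 0.6745 √(2/M) for a crude coefficient based on M characters, and a reduction factor of 50 (n̄₁ + n̄₂)/(n̄₁ n̄₂). The function name is illustrative only. The inputs are the Lachish and Kerma male n̄'s and the number of characters quoted for that comparison in Table XI.

```python
import math

# Assumed constant converting a standard error to a probable error.
PROBABLE_ERROR_FACTOR = 0.6745


def reduced_probable_error(n_chars, n_bar_1, n_bar_2):
    """Probable error of a reduced coefficient (assumed standard definitions).

    n_chars  -- number of characters on which the coefficient is based
    n_bar_1  -- average number of skulls for the first series
    n_bar_2  -- average number of skulls for the second series
    """
    crude_pe = PROBABLE_ERROR_FACTOR * math.sqrt(2.0 / n_chars)
    reduction = 50.0 * (n_bar_1 + n_bar_2) / (n_bar_1 * n_bar_2)
    return crude_pe * reduction


# Lachish (n-bar 184.0) with Kerma (n-bar 107.6), 29 characters.
pe = reduced_probable_error(29, 184.0, 107.6)
print(round(pe, 2))  # 0.13, the probable error printed for this pair in Table XI
```

Under these assumptions a series with a small n̄ inflates the probable error of every coefficient in which it takes part, which is why short series give less reliable comparisons.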

When possible, coefficients of racial likeness between cranial series have been
based on a standard set of thirty-one characters, nineteen of these being
absolute measurements (chords and arcs), seven indices, and three angles. For the
Egyptian material considered it was only possible to use the complete set in a
few cases, owing to the fact that one or several of the measurements are not
available for the majority of the series. In all cases as many as possible of the
selected characters were used in computing the coefficients. The distribution of
the numbers of characters on which the 231 coefficients could be based is:


No. of characters      14   24   25   26   27   28   29   30   31
No. of comparisons    165    2    9    -    6   16   14    7   12

More than half of the coefficients are thus based on only fourteen characters,
all these being comparisons of the ten series measured by Thomson and MacIver
with one another, and with the remaining twelve series. The coefficients between
these twelve are all based on a number of characters which can be supposed
adequate, but it may be suggested that comparisons between them and those
for which only fourteen characters are involved are likely to be unreliable. The
question of how far the generalized estimate of resemblance for fourteen
characters is likely to diverge from that obtained from about twice as many was
investigated by computing some of the more reliable coefficients for the
incomplete set of fourteen characters available for Thomson and MacIver's series.
Comparisons of the corresponding reduced coefficients for (a) all of the characters
available, and (b) for the fourteen characters only, are made in Table IX, in
the case of fifteen pairs of series selected at random.

For seven of the fifteen comparisons the reduced coefficient for fourteen
characters is greater than the corresponding value for the larger number, and
markedly greater in four cases; for the remaining eight comparisons the position
is reversed, but there are no marked differences between the pairs. It appears
that the generalized estimate of divergence derived from the fourteen characters
can usually be expected to give a fairly close approximation to that which would
be obtained from about twice as many characters, but occasionally it will
suggest a rather misleading conclusion. In the case of the material considered,
the fourteen characters show a tendency to indicate a wider separation of the
types than that likely to be found if more evidence were available, and this
might have been anticipated from the fact that they include the characters
which normally show the most significant differences in the comparisons of
Ancient Egyptian material, viz. B, 100 B/L, J and NL.

TABLE IX

Comparisons of corresponding pairs of reduced coefficients of racial likeness (male
series) for all coefficient characters available and for a set of fourteen characters



Iledut’od ets'ffieiciitH 
of riieia! likenesH 

All 

eliaraelcm 



14 

I'liaraetera 

Laehish (1!)1'5*) witli Tliebe.s, ISth -SOtli dyn. (.'it-l) 

2-I3:MI-23(2ri) 

O'Ol MUffi) 

Naqada A and Q (IM'!)) with K1 Kulianicdi North (IhM) 

:!-!ll+(WI (28) 

B’41 ytr.AH 

Abydos, 18th-li)thdyn. (Ill-ri) with 'I’liobos, 18th -2l)th <lyn. (fit-l) 

.Tl.'MD-lH (211) 

K-7li ( O-tM 

Laehish (lOlffi) with Thebes, 18th -2 Ist dyn. (I(t7-1)) 

r.*27,!;tl-ll (2B) 

:b23i:ii-14 

Lauhiah (]91'5) witli (lizeh, 2(ith -3(lth dyn. («(iO*f)) 

.q'77 + li'llli (2!l) 

o*37,i O'OH 

Laehish (Ifll'fl) with Abyssinian (fi2*(i) 

B*Hi)tl)-2l){2li) 

4*2r> ^ 0'27 

Laehish (llll'r>) witli Kerma (107'(1) 

7*SB + fl-13 (2!)) 

7’7H;MM'.I 

Abyssinian (Iffi-O) witli El Kubanieh North (33*1) 

8'I)k'+I)-11 (25) 

re.lB KI'oSI 

Sedment (37'r)) with Thebes, lHtli-2lHt dyn. (l(17-lt) 

!l*lKs(b2!l{:il)) 

lMffi AO-42 ' 

Thebes, 18th"20th dyn. (fit* I) with Abydos, Ist dyn. Royal (33-2) 

12'fl3 (2!l| 

lO'lffi 1 0'li2 ! 

Abys.sinian with Abvdos, 18Ui-19th dyn. (31*5) 

)3*()l:M)'lli(25j 

I!I;:I7MI'IU 

Sedment (37T)) with Abydos, IHth -lfitli ilyn. (Sift) 

13'12n ii'iW^UH) 

I3'!I7 i,0'7ri 

Naqada A and Q (fl-bl)) with (iizoh, 2(ith-3(lth dyn. (Hfi!)*,'>) 

i4';ir.i.iM4 (30) 

ID'Ori Mb2I 

Laehish (iOl'S) witli Ihulari (BM) 

2*M2;MI-22(2!I) 

2Mi7 • 0-27 

Badari (BM) witli Abydos, Ist dyn. Royal (33*2) 

;)B'2.1;MI"M (31) 

4.r33 '»■ OTill 


* The numbers in brackets following the titles of the series give the average numbers of skulls
(n̄'s) in the case of the fourteen coefficient of racial likeness characters available for all the series.
In the case of the comparisons for “all characters” the n̄'s are changed, but they are invariably of
the same order. Fuller particulars regarding the series are given in Table VIII.

The point in question can be examined in an indirect way by comparing the
distributions of reduced coefficients in Table VIII based (a) on fourteen
characters, and (b) on twenty-four to thirty-one characters. These are:



                    Insigni-   Significant
                    ficant*    and <3.5     3.5-5   5-13   13-20   20-30   30-40   40-50   Totals
14 characters           9          16         18      72     27      17       5       1      165
24-31 characters        0           4          6      31     17       5       2       1       66
All comparisons         9          20         24     103     44      22       7       2      231

* A coefficient is counted here as insignificant if it differs from zero by less than 3.5 times its
probable error.


Allowing for the differences in their sizes, these two distributions appear to
be very similar, and it is clear that the coefficients for fourteen characters only
do not show any marked tendency to diverge from the others on the average.
It is curious that there are no insignificant coefficients based on twenty-four
or more characters, but little importance need be attached to this fact.
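To allow for the difference in size of the two distributions, each can be expressed as percentages of its own total. The sketch below is illustrative only; the counts are those of the table above as reconciled against its stated totals (165, 66 and 231), so any entry that the scan leaves doubtful should be treated as an assumption.

```python
# Classes of reduced coefficient used in the table above.
CLASSES = [
    "insignificant", "significant and <3.5", "3.5-5", "5-13",
    "13-20", "20-30", "30-40", "40-50",
]

# Counts per class (assumed readings, reconciled against the row totals).
FOURTEEN_CHARACTERS = [9, 16, 18, 72, 27, 17, 5, 1]
TWENTY_FOUR_TO_THIRTY_ONE = [0, 4, 6, 31, 17, 5, 2, 1]


def percentages(counts):
    """Express each class count as a percentage of the row total."""
    total = sum(counts)
    return [round(100.0 * c / total, 1) for c in counts]


for label, a, b in zip(CLASSES,
                       percentages(FOURTEEN_CHARACTERS),
                       percentages(TWENTY_FOUR_TO_THIRTY_ONE)):
    print(f"{label:>22}: {a:5.1f}%  {b:5.1f}%")
```

On these figures the 5-13 class holds the largest share of both distributions, in keeping with the statement that the two are similar on the average.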

In computing all the reduced coefficients of racial likeness given in Table
VIII, the Egyptian E standard deviations (Davin & Pearson, 1924) were used.
It is thus assumed that the variabilities of all the populations are so closely
similar that any one of them can be adequately represented by the constants
for the longest series of the group available. It has been shown above that the
Lachish standard deviations are very similar to the corresponding values for
the E series. The constants for three of the other series (viz. the Badari,
Sedment, and Kerma) have been compared with the Egyptian E in the same way,
and it was found that the variabilities indicated by them are also of almost
precisely the same order. At the same time it is known that the Ancient Egyptian
series show a clear tendency to be rather less variable than Western European
series of later times.

In the comparison of any two series the full coefficient of racial likeness
involves both sets of standard deviations. The question of the extent to which the
assumption that the population variabilities were precisely the same is likely to
affect reduced coefficients was examined in an indirect way in the case of a few
comparisons. For twenty-nine of the comparisons reduced coefficients happened
to be available computed by using both (a) the Egyptian E σ's, and (b) the
Farringdon St (seventeenth-century London, Hooke, 1926) σ's. In every case the
former value is greater than the latter, as would have been anticipated, and the
ratios of the reduced coefficients found with Egyptian E σ's to the corresponding
values found with Farringdon St σ's range from 1.05 to 1.79, while for 21 of the
29 ratios the range is between 1.3 and 1.6. The Farringdon St variabilities are
obviously inappropriate for use in comparisons of Ancient Egyptian material.

For the ten cases giving the largest ratios the reduced coefficient was also
computed using the Lachish σ's, and the values for the three sets are given in
Table X. The ratios of coefficients found with the Egyptian E σ's to the
corresponding values found with the Lachish σ's range from 0.91 to 1.17. For
seven of the ten comparisons the use of the Lachish σ's gives the lower value,
but all the differences here are so small that they may be considered of no
importance.

The reduced coefficients of racial likeness which were used in this paper to
provide a classification of a number of Ancient Egyptian and related populations
are clearly not statistical estimates of an ideal kind. Owing to the nature of
the material, certain devices have to be used in practice which are likely to have
an appreciable effect on the coefficients obtained. It has been shown above that
there is good reason to believe that the use of a constant set of standard
deviations does not distort the value to any appreciable extent. The use of different
sets of the complete group of thirty-one characters appears to be of far more
practical importance. More than half of the 231 coefficients computed are
actually based on fourteen characters only, the numbers for the remainder
ranging from twenty-four to thirty-one. Comparisons with the shorter series might
have been omitted, but this would have reduced the evidence to such an extent
that it appeared better to include them, while remembering their imperfection.
There are other objections to the constants. It is known that the theoretical
requirement that all the characters used should be uncorrelated with one another

TABLE X

Corresponding reduced coefficients of racial likeness for male cranial series computed
by using (1) the Egyptian E, (2) the Lachish, and (3) the Farringdon St standard
deviations


■ 

No. 

With 

VVitli 

With 


of <'l)a- 

Egyptian A' , 

Laeliish 

Fitrriiigdon 


meters 

fr’H \ 

/f M 

.St, ff's 

Ahyasiniana with Abydos (Eurly [irudyti.) 

14 

12'll«l;lK'i2 

lll’70±li','i2 

"dll 4IK02 

{Ti m) (Ft 40-d) 





Abyssiniana with El Ainrah and Hou (Lato predyn.) 
ft62'6) («10()'(i) 

El Amrah and Hou (Late predyn.) witii 'i'hebea {18--2(ldyii.) 

]4 

7'()4itl'32 

(i'.'iOt lldtt 

... 

4dildlK)2 

14 

KMIStiKid 

K'K/i:,tfKiti 

6'lil ±t)’3(i 

(nl05‘6) (/(fl4-t) 

Abydos (Early predyn.) with Abydos (12-10 dyn.) 

14 

4'!lil±t)'61 

ddMiirOi 

:t't3j:lh61 

(S4fl-9) (?i65'9) 

Naqada A and Q with Abydos (12-15 dyn.) 

14 


2d)!l + lt':ii) 

l'33;i;lt3i) 

in 64'9) (?i 65'9) 

Thebes (18-20 dyn.) with Thebes (18-21 dyn.) 

23 

1'43±0'24 

l-r>7±(t24 

0'iKlit)-24 

(S 63’1) (n ie04) 




Naqada A and Q with Thebes (18-20 dyn.) 

27 


74)8±(b»(l 

6- 18 i (MO 

(B 66’9) (5 63'4) 

Thebes (18-20 dyn.) with Egyptian E 

26 


3-4.6+(tl« 

Hrt±(M8 

(B 53-2) [n 8604)) 




Denderah (Roman) with Abydos (12-15 dyn.) 

14 

(b81±0'45 

6'K8ill‘.l,5 

4-34 + (M6 

(i49-3) pT65'9) 

Thebes (18-20 dyn.) with Deshasheli (4-6 dyn.) 

14 

2’3fl + ()'66 

1‘42±()'50 

2'26 + tK6fl 

(iii54'l) (it39-9) 



is not fulfilled, some pairs of the characters being almost certainly quite highly
correlated in all the series, but it is probable that the relative values of the
estimates of resemblance are not more disturbed by this than by some of the
other factors. In the case of a particular measurement used, comparisons are
only made between mean readings said to have been obtained by following a
particular definition of the measurement. Even so, the personal equations of
different measurers may have been large enough, in some cases, to disturb the
results quite appreciably. It is not possible to investigate this matter in the
case of all the material used, but it must be recognized that the disturbance due
to it is probably quite sufficient to invalidate any rigid application of statistical
rules to the material. Mistakes in sexing the crania provide another possible
source of error, though probably this is one of minor importance.

It is not at all likely, of course, that the extent to which a particular
coefficient is distorted is determined by the accumulation of the various kinds of
disturbing factors mentioned. One may hope that they will usually counteract
one another. In view of all the circumstances, it is obvious that no great reliance
can be placed on the numerical accuracy of any particular reduced coefficient
of racial likeness, or on the comparison of any pair of the constants. It would
be most inadvisable, for example, to compare the difference between any pair
with regard to the probable error of this difference. It is not unreasonable,
however, to accept the coefficients as indicating different, rather broad, grades
of resemblance, and it is in this sense that they are used for purposes of
classification here, in conformity with previous biometric practice.

The classification of the series treated in this section is based on reduced
coefficients of racial likeness for pairs of the male series: the female series are in
most cases shorter, but they may be expected to lead to very similar conclusions.
Corresponding male and female values for the Lachish compared with six other
series are given in Table XI. For the first of these comparisons the difference

TABLE XI

Corresponding male and female reduced coefficients of racial likeness for
comparisons of the Lachish with other series*



Abydos 

18 th dyn. 

Thebes 

18th-20th dyn. 

Abydos and Hou 
12th-15th dyn. 

tJ(40-0) 9(C0-I) 

d(53-8) 9(43-R) 

^(66-9) $(87-4) 

Latilliflh 

(j 

(1H4-0) 

V 

(147-1) 

l-2fl+0-32 

(14) 

1 -or) + 0-28 

(14) 

2-13 + 0-23 
(24) 

e-37+0-29 

(24) 

4-62 ±0-26 
(14) 

2-51 ±0-23 
(14) 




Egyptian IS 
2titii-30th dyn. 

Denderah 

Korma 



(1th -12th dyn. 

12th"13th dyn. 



.1 (H59-fi) V (fiOtb?) 

d (108-0) 9 (140-4) 

(J (107-6) $ (84-2) 


3 

f)-74-t-0-00 

6-f)7±0-14 

7-86±0-13 

Lachish 

(184-0) 

9 

(29) 

(14) 

(29) 

0-87 + 0-08 

8-07 ±0-17 

12-86 ±0-17 


{147-1) 

(29) 

(14) 

(29) 


* The series compared in this table with the Lachish are ones for which fuller particulars are
given in Table VIII. The numbers in brackets after the sex signs are the n̄'s for the numbers of
skulls on which the means used in calculating the coefficients are based. The n̄'s given for the
Lachish series are for the twenty-nine characters on which two of the comparisons are based, and
its values for the other groups of characters are very close to these.

between the male and female coefficient is of no account, but for some of the
others the difference is appreciable, even in the case of the Lachish compared
with the Egyptian E series, for which all the means involved are based on an
adequate number of skulls. The estimates of resemblance are in fact rather
different for the male and female comparisons, though there is a fairly close
agreement in their orders, except in the case of the comparison with the
18th-20th dynasty series from Thebes. The means for this series suggest that the
groups of male and female skulls composing it may not have represented
precisely the same population, and any slight discordance of this kind is likely to
have an appreciable effect on the comparisons considered. Errors in sexing are
also likely to affect them to some extent. We are again led to the conclusion that
there is no justification for attaching any importance to small differences between
coefficients of racial likeness.

The conclusions regarding the racial affinities of the populations represented
by the twenty-two Ancient Egyptian and allied series may now be considered.
Reduced coefficients of racial likeness between all pairs of them are given in
Table VIII, and the arrangement suggested by the lowest orders of these
constants is shown in Fig. 3. Three sets of series, representing races in other parts
of the world, have been treated in the same way: the results for a North American
set have been published by von Bonin & Morant (1938), for an Asiatic set by
Woo & Morant (1932), and for a European set by Morant (1928). For this last
material crude coefficients of racial likeness only are given for forty-one series,
four of which relate to North African peoples. I am indebted to Morant for the
corresponding reduced coefficients, which he has calculated but which have not
been published. The four North African series were excluded, so the numerical
results to be considered refer to thirty-seven European series.

It is of interest to compare the distributions of the reduced coefficients for
the new Egyptian and the three other sets. These are given in Table XII. There
are marked differences between the ranges of the constants for the four sets of
material. On the average the Egyptian shows much lower values than the other
three sets, the highest reduced coefficient for it being 47-fi, while for each of
the others more than one quarter of the coefficients exceed 30 and several values
greater than 100 are found.
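The totals being compared follow directly from the numbers of series, since every pair of series within a set yields one coefficient. A minimal sketch of that count, using the set sizes given in the text (22, 37, 16 and 26 series); the dictionary name is illustrative only:

```python
from math import comb

# Number of male cranial series in each set, as stated in the text.
SERIES_PER_SET = {
    "Ancient Egyptian and allied": 22,
    "European": 37,
    "North American Indian": 16,
    "Asiatic": 26,
}

# One reduced coefficient per unordered pair of series: n(n - 1) / 2.
pair_counts = {name: comb(n, 2) for name, n in SERIES_PER_SET.items()}

for name, count in pair_counts.items():
    print(f"{name}: {count} comparisons")
```

For the twenty-two Egyptian and allied series this gives the 231 coefficients referred to throughout the section.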

The relationships of the modern races of man are of such a kind that they
can be roughly divided into groups, so that all the members of a particular
group show close resemblances, while there are links between the different
groups. This conception of constellations of races, as it were, which are joined
to one another is undoubtedly useful, though it is often difficult to give precision
to it, owing to the existence of subgroups within the major groups. The
distributions in Table XII suggest that the selected set of twenty-two Ancient
Egyptian and allied series may be considered to represent a single constellation
of populations which had very similar types, while each of the other three sets
represents two or more constellations. The Egyptian series actually relate to
peoples who lived from early predynastic to modern times, a span of about
7000 years, but there is no suggestion that the group changed radically within this
period either as the result of evolutionary change, or owing to infiltration by
alien peoples. The area represented extends from Abyssinia to Palestine, though
nearly all the material comes from Egypt.

TABLE XII

Distributions of reduced coefficients of racial likeness for
various groups of male cranial series

                              No. of           Reduced coefficients of racial likeness                  Total
Groups of series              series     <5      5-13     13-50    50-100   100-200  200-400  400-430   comparisons

Ancient Egyptian and allied     22       53      103       75        -         -        -        -         231
                                      (22.9%)  (44.6%)  (32.5%)

European                        37       33      102      362       135       34        -        -         666
                                       (5.0%)  (15.3%)  (54.4%)  (20.3%)    (5.1%)

North American Indian           16        2        9       59        45        5        -        -         120
                                       (1.7%)   (7.5%)  (49.2%)  (37.5%)    (4.2%)

Asiatic                         26        7       23      102       110       56       26        1         325
                                       (2.2%)   (7.1%)  (31.4%)  (33.8%)   (17.2%)   (8.0%)   (0.3%)


It is known from experience that in the cross comparisons of two groups of
cranial series, representing two distinct families of races (A and B), some lower
reduced coefficients will be found between them than some for pairs of series
belonging to group A, or than some for pairs of series belonging to group B. In
comparisons of the Ancient Egyptian with North American Indian series, for
example, it would be anticipated that a certain proportion of the coefficients
would be below 60, though the majority would doubtless exceed this limit. This
peculiarity of the generalized measure of resemblance employed must be
supposed due to the peculiar nature of the material, and it need not be taken to
indicate any defect in the method. A classification based on such generalized
estimates of resemblance must evidently proceed on certain lines, and the way
in which the constants can most usefully be employed for the purpose has to be
determined empirically.

Some of the cross comparisons A × B referred to give lower reduced
coefficients than some for pairs of A's, or some for pairs of B's, but at the same time
they have always been found to exceed a certain limit, so that the lowest order
of coefficients are not represented by any A × B comparisons. This suggests
that any classification of the material ought to be derived solely from coefficients
less than the limit in question, i.e. from the evidence of close resemblance only,
such as is never found, as far as experience goes, between any two populations
belonging to distinct families of races.

The earlier studies of groups of cranial series treated by the method of the
coefficient of racial likeness suggested that the limit in question might be taken
as a reduced coefficient of about 20, so that the classification would be based on
all values less than 20, while all greater than this limit would be neglected
entirely. Recent experience of the cross comparisons between North American
Indian and Asiatic series (von Bonin & Morant, 1938) has shown that it is
safer to employ a considerably lower limiting value, which was provisionally
taken at 13. If the particular set of material treated is made up by a
considerable number of very closely related series, a clearer picture of the
interrelationships is obtained, however, by taking a still lower limiting value.

In the present case the limit was chosen arbitrarily at 5.0, that is to say the
evidence presented by all reduced coefficients less than 5.0 is taken into account
(bringing in 23 % of the total of 231 comparisons), while all values greater than
5.0 are neglected. It should be realized that a reduced coefficient less than 5.0
represents very close similarity of the cranial types compared, such as has
hitherto only been found in cases where close relationship between the
populations represented was anticipated from the cultural (historical or
archaeological) evidence.
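The effect of the choice of limit on the amount of evidence retained can be checked from the Ancient Egyptian row of Table XII. The sketch below is illustrative only; it takes the class counts 53, 103 and 75 as read from that table and compares the limit of 5.0 adopted here with the provisional limit of 13 mentioned above.

```python
# Ancient Egyptian and allied comparisons by class of reduced coefficient
# (counts as read from Table XII; the three classes exhaust the 231 pairs).
EGYPTIAN_COUNTS = {"<5": 53, "5-13": 103, "13-50": 75}


def share_retained(classes_kept, counts):
    """Percentage of all comparisons falling in the classes that are kept."""
    total = sum(counts.values())
    kept = sum(counts[c] for c in classes_kept)
    return round(100.0 * kept / total, 1)


share_limit_5 = share_retained(["<5"], EGYPTIAN_COUNTS)
share_limit_13 = share_retained(["<5", "5-13"], EGYPTIAN_COUNTS)

print(share_limit_5)   # 22.9, i.e. the "23 %" quoted in the text
print(share_limit_13)  # 67.5
```

A limit of 13 would thus retain about two-thirds of the comparisons, which is consistent with the remark that a lower limit gives a clearer picture for a set of closely related series.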

The grades of resemblance indicated by reduced coefficients of racial likeness
less than 5.0 are shown in Fig. 3. The full lines represent “insignificant”
coefficients, i.e. values which differ from zero by less than 3.5 times their probable
errors. In those cases the two series might have represented precisely the same
population, as far as can be seen from the direct comparison, though, owing to
the imperfection of the material, no stress can be laid on the difference between
an “insignificant” coefficient and one of the same order which indicates clear
differentiation judged by the ratio of the constant to its probable error. The
second grade of resemblance is represented by coefficients which differ
significantly from zero, but which are less than 3.5, and the third by values between
3.5 and 5.0.
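The three grades can be written as a simple rule. The following is a sketch only: the function name is hypothetical, and the treatment of values falling exactly on a boundary is an assumption, since the text does not specify it. The first two example pairs are taken from Table XI; the third is invented to illustrate the first grade.

```python
def resemblance_grade(coefficient, probable_error):
    """Return the grade (1, 2 or 3) of a reduced coefficient, or None.

    Grade 1: "insignificant", differing from zero by less than 3.5 times
             the probable error.
    Grade 2: significant, but less than 3.5.
    Grade 3: between 3.5 and 5.0.
    None:    5.0 or more, neglected in the classification.
    Boundary handling (strict "<") is an assumption made for this sketch.
    """
    if abs(coefficient) < 3.5 * probable_error:
        return 1
    if coefficient < 3.5:
        return 2
    if coefficient < 5.0:
        return 3
    return None


print(resemblance_grade(2.13, 0.23))  # Lachish with Thebes, 18th-20th dyn.: 2
print(resemblance_grade(7.86, 0.13))  # Lachish with Kerma: None (neglected)
print(resemblance_grade(0.50, 0.30))  # hypothetical pair: 1
```

The first test is applied before the others because an "insignificant" coefficient is defined by its ratio to its probable error and not by its absolute size.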

It should be stressed that any coefficients of racial likeness less than 5.0
indicate a very close resemblance of the cranial types compared. There are two
adequately long male series of seventeenth-century London skulls, from
Whitechapel and Farringdon St, and the coefficient between them is 3.8 (see Morant &
Hoadley, 1931, p. 233), so a value of this order can only be taken to show the
degree of divergence which may be found between parochial subgroups of the
same population.

The arrangement shown in Fig. 3 was arrived at by placing the series, as far
as possible, in position relative to one another so that their distances apart are
proportional to the estimates of resemblance considered. One of the series
treated, viz. that of the 9th dynasty skulls from Sedment, is not shown,
because the lowest reduced coefficient found with it is 5.1 (with the 4th-5th
dynasty series from Deshasheh and Medum).

There is a clear suggestion of a division into two groups. 

Group A includes eight series from sites in Upper Egypt (covering a stretch
of the Nile, about 100 miles in length, from Badari in the north to Thebes in
the south; see Fig. 2), a series from El Kubanieh (100 miles south of Thebes),
and a Nubian series from Kerma. The period represented by these ten series
extends from early predynastic times to the 18th dynasty.

Group B includes seven series from the same area in Upper Egypt, two from
Lower Egypt, one from Abyssinia, and the Lachish from Palestine. The Upper
Egyptian series here range in time from the 1st dynasty (only represented by
a single series of skulls from Royal Tombs) to Roman times. The two Lower
Egyptian series and the Lachish (of about 25th dynasty date) fall within the
same period, and the Abyssinian series is of modern date.

The population of Upper Egypt is represented by fifteen of the twenty-two
series, representing a time sequence from early predynastic to Roman times. All
the series of earlier date than the 18th dynasty except one fall in group A, and
show very close interconnexions. At the same time there is a clear suggestion
that the type was changing gradually with time, the earliest predynastic
(Badari) series being at one extreme, and the middle dynastic series at the other.
The only aberrant Upper Egyptian series of a date prior to the 18th dynasty
is one from Royal Tombs at Abydos, and it may be supposed that these
represent an intrusive group coming from Upper Egypt. The fact that the early
Upper Egyptian type persisted almost unchanged until the 18th dynasty is
evidenced by the series of this date from Sheikh Ali.

There are four Upper Egyptian series of the 18th-21st dynasties, two from
Abydos and two from Thebes, which are quite distinct from the Upper
Egyptian (A) group of series, and which must hence be supposed to represent mainly
an intrusive population in the region. Immediately before the 18th dynasty
the whole of Egypt was under the Hyksos dominion, and when this lapsed at
the end of the 17th dynasty the country was in an unsettled state. It may be
noted that if there were immigrants into Upper Egypt at this time, they were
most likely to have been of the ruling classes, who would have settled and been
buried in the principal towns in Upper Egypt, viz. Abydos and Thebes. The
population of Upper Egypt in later times is only represented by two series from
Denderah, one of Ptolemaic and the other of Roman date. The types of these
two are very similar to those of the presumed intrusive groups of the 18th-21st
dynasties, though they stand somewhat closer to the indigenous Upper Egyptian
types. This may obviously be explained as being due to some slight degree of
intermixture between the intrusive group and the settled population of Upper
Egypt between the 18th dynasty and Ptolemaic times.

If an explanation of this kind is on the right lines, then it becomes necessary
to discover the source of the people who are presumed to have migrated to
Upper Egypt about the time of the 18th dynasty. The relationships of the
cranial series in Fig. 3 provide a clear answer to this question, as they show
intimate connexions between two late Upper Egyptian series and two of
the three Lower Egyptian series available, viz. the 4th-5th dynasty series from
Deshasheh and Medum, and the 26th-30th dynasty from Gizeh, forming with
them what is here called the B group.

Unfortunately, the evidence available is quite inadequate to provide an 
outline of the racial history of Lower Egypt. A third series from that region, 
viz. one from Sedment of 9th dynasty skulls, is not shown in the diagram,
because its lowest reduced coefficient exceeds 5.0; this is 5.1 with the 4th and
5th dynasty series from Deshasheh and Medum, sites close to Sedment. The
three Lower Egyptian series are very similar to one another, and they stand on 
the same side, as it were, of all the Upper Egyptian material. Another factor 
which has to be taken into account in interpreting the evidence is that the Upper 
Egyptian type was apparently becoming gradually modified, prior to the 18th 
dynasty, in the direction of the Lower Egyptian type. The evidence suggests
the following general conclusions:

(1) It may be supposed that originally the populations of Upper and Lower
Egypt formed two distinct groups, the purest representatives of these known
being the Early predynastic from Badari, and the 4th-5th dynasty series from
Deshasheh and Medum. These are the earliest series available from Upper and
Lower Egypt, respectively. For convenience, the groups may be called, following
Morant, the Upper (corresponding to our Group A), and the Lower (Group B)
Egyptian types. From the earliest times until about the end of the 17th dynasty,
the type of the Upper Egyptian population became gradually modified in the
direction of that of the Lower Egyptian. This may be supposed due to a gradual
infiltration of Lower Egyptians into Upper Egypt. The relationships of the
Kerma series from Nubia are of particular interest in this connexion. It is of
12th-13th dynasty date, but its closest connexions are with two predynastic
series from Upper Egypt. Hence it may be supposed that the Kerma people
represent the descendants of colonists who left Upper Egypt in predynastic
times, and that this stock remained stable, and was not modified by
intermixture, though the parent group itself was changing owing to contacts with
the North.

There is only one Upper Egyptian series of earlier date than the 18th dynasty
which stands apart from the constellation formed by all the others. This is of
1st dynasty skulls from Royal Tombs at Abydos, and as their type is very
similar to the Lower Egyptian, it may be supposed that they represent an
intrusive group from that region. This group, which was probably small, may have
been absorbed in the Upper Egyptian population without affecting the type of
the latter appreciably.

(2) The situation in Upper Egypt appears to have changed radically in the
18th dynasty. There is one Upper Egyptian series of this date (from Sheikh Ali)
which is very similar to the earlier series from that region, but four others
(ranging from the 18th to the 21st dynasty) stand quite apart and show close
relationships with the Lower Egyptian series. The movement from Lower to
Upper Egypt appears to have been greatly accelerated about the time of the
18th dynasty. The evidence suggests that at this time it virtually became a
peaceful invasion of Upper Egypt, which resulted in the almost complete
displacement of the earlier population there. This view is supported by the fact
that there are no series whatever later than the 18th dynasty of the early
Upper Egyptian type, while two late series from Denderah are still of Lower
Egyptian type, though somewhat closer to the early Upper Egyptian than are
the 18th-21st dynasty series from Thebes and Abydos. These relationships
suggest that the prevailing population in Upper Egypt after the 18th dynasty
was of Lower Egyptian origin, and that it became mixed to some extent with
descendants of the earlier population of the region.

The fact that a modern population from the north of Abyssinia stands
between the two Ancient Egyptian groups is interesting, but far more evidence
would obviously be required to elucidate the significance of this relationship.

The relationships of the Lachish series may now be considered. All its
closest connexions are with series of the Lower Egyptian type, and these are
close enough to suggest that the population in the Palestinian town was entirely,
or almost completely, of Egyptian origin. The skeletal material from Lachish is
believed to represent people who died about 700 B.C., which is the time of the
25th dynasty in Egypt. The series shows closest resemblance, however, not to
the 26th-30th dynasty series from Gizeh, but to the early dynastic series from
Lower Egypt (Deshasheh and Medum), and to three Upper Egyptian series of
18th dynasty or later dates which are assumed to represent populations of
Lower Egyptian origin. There is also a rather less close connexion between the
Lachish and one of the Upper Egyptian types (Abydos and Hou, 12th - lOth 
dynasty). These relationships suggest that the Lachish population represents
descendants of a colonizing group of men and women, which was derived
primarily from Upper Egypt, at some time later than the 18th dynasty, and
which maintained its type unchanged, free from intermixture, until 700 B.C.
or later. It is not at all unlikely, of course, that a colonial group in Palestine,
such as the one which settled in Lachish, originally included people from several
parts of Egypt, and not from a single restricted locality. The characteristics of
the Lachish sample give no hint of heterogeneity, however, which may be due
to the fact that any diversity which originally existed in the group was obscured
by intermarriage within the community for several generations. The evidence
suggests quite clearly that the Lachish people were derived principally from
a population of Upper Egypt which was itself derived principally from emigrants
who left Lower Egypt about the time of the 18th dynasty.

7. COMPARISONS OF THE LACHISH AND ANCIENT EGYPTIAN
SERIES FOR CHARACTERS CONSIDERED SINGLY

Comparisons of the Lachish and a number of Ancient Egyptian and allied
cranial series, based on the averages for a number of characters considered in
conjunction, have been made in the preceding section. The means for the same
material will now be compared for characters considered singly.

A convenient way of summarizing the statistical comparisons of means for
groups having a number of features recorded, is provided by considering the
α's calculated in computing the coefficients of racial likeness. An α is an
approximation to the square of a quantity which is the difference of the means
divided by the standard error of this difference. If an α is greater than 10, it
may be supposed that the difference between the two means is clearly significant.
The percentages of α's greater than 10 may be used to distinguish characters
which are practically constant for all the groups from those which frequently
denote differentiation.
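
The calculation just described can be sketched as follows. This is only an illustration under stated assumptions: the standard error of the difference is taken from a single standard deviation assumed common to both series (the text does not say how it was obtained), and the function names and all numerical values are hypothetical, not taken from the Lachish data.

```python
from math import sqrt


def alpha(mean_a, n_a, mean_b, n_b, sd):
    """Square of (difference of means / standard error of that difference).

    Assumption: one standard deviation `sd` is shared by both series.
    """
    se_diff = sd * sqrt(1.0 / n_a + 1.0 / n_b)
    return ((mean_a - mean_b) / se_diff) ** 2


def percent_significant(alphas, threshold=10.0):
    """Percentage of alphas greater than the threshold (10 in the text)."""
    return 100.0 * sum(a > threshold for a in alphas) / len(alphas)


# Hypothetical means, sample sizes and standard deviation for one character.
example_alpha = alpha(137.0, 300, 133.0, 50, 5.0)

# Hypothetical alphas for one character across four comparisons.
example_percent = percent_significant([12.0, 3.0, 11.0, 0.5])
```

With the hypothetical figures above the α is well over 10, so the difference would be counted as clearly significant under the rule given in the text.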

The twenty-one comparisons between the Lachish and each of the other
series may be considered first. The numbers of α's available for this set range for
the different characters from seven to twenty-one, only fourteen of the
thirty-one coefficient of racial likeness characters being available for all the series.*
These fourteen may be considered first, and they give the following grouping:
(a) characters showing a high proportion of significant differences:

B (percentage of α's > 10 = 47.6), 100 B/L (42.9), and 100 B/H' (42.9);

(b) characters showing a lesser proportion of significant differences:

N∠ (28.6), H' (28.6), (23.8), J (19.0), and 100 H'/L (19.0);

(c) characters showing few significant differences: 

L (14.3), 100 NB/NH (9.5), A∠ (9.5), G'H (4.8), NB (4.8), and B∠ (0.0).

It is clear that some characters can be supposed practically constant for all
the series, while others show many significant differences.

The remaining characters which can be treated in the same way are only
available for numbers varying from eight to twelve of the twenty-two series.
The numbers of comparisons are very restricted for these, but they suggest
that S, G1, G2, 100 G'H/GB, 100 G2/G1, and P∠ are practically constant for all
the series, while the Lachish is most frequently differentiated by B', Oc.I., fml,
fmb, U, and rather less frequently by O1, O2, 100 O2/O1 and 100 fmb/fml.

Table XIII gives the ranges of certain male means for the Upper Egyptian
group of series (i.e. the ten on the left-hand side of Fig. 3), the means for the Lachish
alone, and the ranges for the remaining nine comprising the Lower Egyptian
group (i.e. all on the right-hand side of Fig. 3 except the Lachish and
Abyssinian). There is seen to be complete separation of the ranges for the two sets
of series, with the Lachish falling within the Lower Egyptian limits, in the case
of B, B', U, and 100 B/H', and a close approach to the same condition in the case
of J and 100 B/L. All these measurements are transverse breadths, or indices
including a breadth, and there is no doubt that the values of the reduced
coefficients of racial likeness are largely determined by differences between
them. Owing to the use of correlated measurements, the differences in cranial
breadth affect the generalized measure of resemblance unduly. This factor was
apparently showing far more significant variation in Ancient Egyptian
populations than any other relating to the cranium.

* The transverse arc (βQ') for the Lachish series is not available for any of the others, and the
estimated cranial capacities were not included, so the maximum number of characters used in
computing coefficients with the Lachish series is 29.


TABLE XIII

Ranges of mean measurements for two groups of Ancient Egyptian
male skulls and the Lachish means


Series

Period

B

J

B'

Upper Egyptian type

Early predyn.-18th dyn.

I3I-4 -134;t (]())* 

)23'f! 137'd (H) 

!«)-4 (I2'K (4) 

Lachish

ca. 26th dyn.

13l)’« 

12H'.t 

llfcn 

Lower Egyptian type

1st dyn.-Roman

HIM isica (ii) 

127',')d31-3 (K) 

Il34)4lli'3 (5) 


Series 

Period 

U

100 B/L

100 B/H'

Upper Egyptian type
Lachish

Lower Egyptian type

Early predyn.-18th dyn.
ca. 26th dyn.

1st dyn.-Roman

r>m sw-4 (4) 
filK-l 

r)ltl'8-518-7 (Ii) 

7 1-7 73-7 (1(1) 
74'3 

73'7 7tHI («) 

. 

‘Ml WI-l (Id) 
102-4 

102-:) (11) 


Series 

Period 

L

H'

N∠

Upper Egyptian type
Lachish

Lower Egyptian type

Early predyn.-18th dyn.
ca. 26th dyn.

1st dyn.-Roman

182-2-185-2(l[)) 

lH4-fi 

lBl-4-lHf)-8 (i)) 

13H 13fHl(tll) 
l.'iii'H 

1311-7 13(1-11 (il) 

(IP -SI tl7 -(1(8) 
M'-n 

(iii'M (8) 




fml 

fmb 

Oc.I.

Upper Egyptian type
Lachish

Lower Egyptian type

Early predyn.-18th dyn.
ca. 26th dyn.

1st dyn.-Roman

34-9-30-4 (4) 
37-0 

36-1-37-0 (6) 

28- 7-.3()'l (4) 

3(1-, 7 

29- 7-30-2 (.7) 

M-2 B2-3 (4) 
fiii-ri 

r)(i-9-fll-.7 (3) 


* The numbers in brackets indicate the number of series to which the ranges relate.

The remaining characters treated in the table make no clear distinctions
between the two sets of series, and the same situation is observed in the case of
all the other coefficient of racial likeness characters. The Upper Egyptian series
show a slight tendency to be more prognathous than the Lower Egyptian
(judging by N∠), but the two ranges overlap appreciably. It may be noted that
the foraminal length and breadth of the Lachish type are extremely large, and
its occipital index is extremely low compared with those for the other series.

The majority of the measurements not used in computing the coefficients
appear to be fairly constant for all the Ancient Egyptian series, and the Lachish
means fall within their ranges. This is so for the bimaxillary breadth (GB), in
spite of the clear differences found between the bizygomatic and calvarial breadths.
The index 100 (B-H')/L does make a clear distinction between the Upper and
Lower Egyptian sets of series, but this again appears to be due to the fact that
a breadth measurement is involved.

The simotic measurements, giving estimates of the "flattening" of the nasal
bridge, are only available for a few series. Comparison with the data for these
given by Woo & Morant (1934) shows that the breadth of the nasal bones
(SC) is unexceptional for the Lachish type, but the subtense (SS) and index
(100 SS/SC) are decidedly larger for it than for the Badari, Kerma, Sedment,
and an Ancient Nubian type. The greater curvature of the nasal bones in the
Lachish skulls places them within the range found for European populations.

Comparative material for the malar bone measurements is still more
restricted. Comparison with the data given by Woo (1937) suggests that the
Lachish means are unexceptional, but for the index measuring the curvature of
a horizontal section of the bone (100 S/G, ml). For this the Palestine series has
a mean which is decidedly greater than those for the two Egyptian series, and
this mean places it close to the extreme for the types from all parts of the world
hitherto described. For the vast majority of measurements the Lachish skulls
are not distinguishable as a group from series representing Ancient Egyptian
populations. The most distinguishing features of the type appear to be the
exceptional curvature of the nasal bridge and horizontal sections of the malar
bones. This divergence is suggestive, but more abundant comparative material
would be needed to assess its significance.

8. TYPE CONTOURS OF THE LACHISH SERIES

Contours of adult skulls in the Lachish series were drawn in accordance with
the methods used in a number of earlier craniometric studies published in
Biometrika, and a selection of the total group had to be made for this purpose.
Types were constructed from the measurements of these contours in the usual
way (Figs. 4-9), and they relate to material from the four tombs combined
(see p. 123). The selection was made by first excluding all specimens too
incomplete to give the Frankfurt orientation, and then excluding from the
remainder those for which the horizontal and transverse sections are defective
to more than a slight extent. The totals included are 108 male and 89 female
crania, and almost all the measurements used in constructing the types (Tables
XV-XVII) are available for every one of these specimens, except in the case
of facial measurements of the sagittal contours for which the numbers vary
appreciably. The type contours of the Lachish series are based on samples which
are almost as large as any previously used for the purpose.

A comparison is made in Table XIV between certain measurements of the
sagittal type contours and corresponding caliper measurements for the total

TABLE XIV

A comparison of mean caliper and type contour measurements


Character

Male

Female

Contour

Caliper

Contour

Caliper

L 

18rf3(in8) 

184-5 (322) 

I7K-K (89) 

170-H (259) 

ir 

IDfi-O {0(i) 

133-8 (208) 

127-0 (74) 

128-4 (213) 

S,' 

li:Mi(]08) 

112-9 (299) 

110-2(89) 

lOK-7 (248) 

SJ 

IKM (108) 

IIO-O (323) 

111-8 (89) 

112-1 (2511 

Og 

()(i'7 (!I7) 

90-3 (280) 

94-5 (79) 

94-0 (210) 

fnil 

30-7 (90) 

37-0 (247) 

35-1 (7-1) 

35-8 (193) 
tiO-H (87) 

G'H 

69'7 (00) 

70-1 (98) 

011-2 (4-1) 

QL 

04' 2 (02) 

94-3 (89) 

91-1 (3K) 

IHMi (70) 

LB 

lOl'H (90) 

l(M)-7 (243) 

97-1 (74) 

90-4 (200) 

PI 

87'MI (00) 

80''-0 (HI) 

80 -.5 (44) 

H-l -9 (02) 

Nl 

63''-4 (02) 

04'’-0 (89) 

05"- 1 (38) 

IM -5 (75) 

Al 

7.'5°-2 (02) 

73"-9 (80) 

73‘-'-0 (38) 

73 -7 (75) 

Bl 

41“'4 (62) 

42'’-0 (89) 

41"-3 (38) 

4r‘-8 (75) 


series. The former are either used in the construction of the type (G'H, GL, LB),
or calculated from measurements used in its construction (H', S1', S2', S3', fml,
P∠, N∠, A∠ and B∠), or measured on the figure (L). In comparing the means
of the two kinds, it must not be forgotten that one series represented is a
selected group of the other. It might be anticipated that the process of selection
described would favour the larger and stronger skulls, more likely to be well
preserved, and hence rather larger means would be expected for the contour
series. The divergences between corresponding values are actually found to be
very small. Of the nine absolute measurements the male contour mean exceeds
the caliper in six cases, the largest difference being 1.2 mm., and the position is
reversed for the other three, which show a maximum difference of 0.4 mm. For
the same measurements the female contour mean exceeds the caliper value in
five instances, the maximum difference being 2.0 mm., and the maximum
difference for chords showing the caliper greater than the contour mean is
0.8 mm. The angles show a close correspondence. On the whole there is a good
agreement between the measurements obtained in the two ways, suggesting
that the contours were drawn with sufficient accuracy, and that the sub-series
gives a fair representation of the type for the total series.


Fig. 4. Transverse type contour based on 108 male Lachish skulls.


The type contours (Figs. 4-9) have no striking peculiarities, and super- 
ficially, at any rate, they appear to be very similar to those given for Ancient 
Egyptian and even some European series. They are average in size, show an 
orthognathous facial skeleton, and only moderate muscular development. The 
male contours are larger than the female in all respects to the same extent as 
is usually found. 


Fig. 10 shows the three male Lachish types and those given by Miss Collett
(1933) for the Kerma, and by Miss Stoessiger (1927) for the Badari series,
superposed. There is clearly a close agreement between the average contours for these
two Ancient Egyptian and the Palestinian series, though the differences are
probably almost as great as those which would be found between any pairs of
the types of Ancient Egyptian series. The Kerma and Badari series are assigned
by their mean measurements to the Upper Egyptian group, and the Lachish
belongs to the Lower Egyptian (see Fig. 3). The superposed types show the
greatest differences in calvarial breadth (transverse and horizontal sections),
and decidedly smaller ones in calvarial lengths (horizontal and sagittal), as
would have been anticipated. The Lachish facial skeleton is seen to be rather
less prognathous than the Kerma or Badari, but the section of its nasal bones
is the most projecting (cf. p. 145). The type for the predynastic series is the
smallest in nearly all respects and particularly in facial height.

TABLE XV

Mean measurements of transverse contours of the Lachish skulls

TABLE XVI

Mean measurements of horizontal contours of the Lachish skulls

TABLE XVII

Mean measurements of sagittal contours of the Lachish skulls


Ordinates above Ny 


Sex Ny 


0=N Ni 1 


2 

3 

4 

72'6 (108) 

80‘1 (108) 

84-2 (108) 

69-8 (89) 

76-7 (89) 

SO’S (89) 

1 

■[H 



Ordinates above Ny 


8 

9 

73-7 (108) 

66-0 (108) 

69'9 (88) 

60-6 (80) 


M) (H9) 


Ordinates below Ny 


2 


Vertex 


cc 

9 

O 

S' 

db 

38-7 (108) 

47-3 (89) 

37-7 (89) 


Sex 

Bregma 


X from N 

. . ! 

y 


GlaboUa 


X from y 

y 

6'1 (108) 

27-1 (108) 

6'3 (89) 

25.4 (89)

TABLE XVII (cont.)

Mean measurements of sagittal contours of the Lachish skulls (cont.)


Aur. Pt. 


Opiatliion 


X from N y X from y y m from y y Prom y Prom N 


106.3 (96) 101.8 (96)

9. COMPARISONS OF THE LACHISH SERIES WITH SERIES OF CRANIA
AND LIVING PEOPLE OTHER THAN EGYPTIAN

The comparisons made above have shown that the Lachish series is very
similar in type to several Ancient Egyptian series. In fact it bears as close a
resemblance to some of these as they do in general to one another. Hence it
may be concluded that the population of Lachish in the year 700 B.C. was
primarily, at least, of Egyptian origin. The cranial evidence from other countries
of the Near East is very meagre. Comparisons are made with the only other
Palestinian series of any length, and with a few series of living people, in this
section.

A considerable number of human skeletons was excavated by Prof. R. A. S.
Macalister at Gezer, 17 miles south-east of Jaffa, from 1902 to 1905 and 1907 to
1909. These are dealt with in a chapter in his report (1912). Material of the
pre-Semitic period was very fragmentary, and no measurements of it could be taken.
Five indices for a series of skulls of the Semitic periods are given in the form of
frequency distributions only. Division is made between a series for the first
and second periods together, and a series for the third and fourth periods
together, in the case of the cephalic index. A distribution of reconstructed
statures is also provided. The material was apparently unsexed, and it is not
clear how the absolute frequencies could be determined from the diagrams.
All that can be said is that the average cephalic index for both series of unsexed
skulls is about 76. The Lachish male mean is 74.3, and the female 75.5.

There appear to be no published records for any other long series of skulls
of any date from Palestine, or any neighbouring country except Egypt.
Measurements of small numbers of ancient Jewish and Phoenician and modern Arab
specimens have been given, but these are practically worthless for statistical
purposes. There are no adequate data for Ancient Jewish skulls from any
locality. The longest modern series representing this people is one published by
Prof. J. Matiegka (1926) of seventeenth-century Jews buried in Prague. The
average cephalic index for the fifty-three male skulls is 82.0, which is sufficient
to show that there can be no close connexion with the Lachish people (74.3),
or with any of the Ancient Egyptian groups, for which the liightmt index is HH). 
The coefficient of racial likeness was computed for twenty-two characters between
the male Jewish series from Prague (n̄ = 32.8) and the Lachish series (n̄ = 202.8);
a reduced value of 62.6 is found. This is greater than the maximum found
between any pair of the series of Egyptian type, including the Lachish (see
Table XII). It is not suggested, of course, that the Palestinian Jews in 700 B.C.
were necessarily of the same type as Jews in Prague in the seventeenth century.

Measurements of a certain number of Jewish people in Palestine have been
published, and only the cephalic index for series of men will be considered here.
No accurate comparisons with cranial data can be given in the case of any
other characters recorded. Weissenberg (1909) gives a mean value of 79.8 for
fourteen men in Galilee. All the remaining series of Jews in Palestine represent
the Samaritans, the sources and mean indices being:

Weissenberg (1909), 76.2 (20); Kappers (1934), 77.2 (27);

Szpidbaum (1927), 77.6 (27); Huxley (1906), 78.1 (36);

Genna (1938), 79.1 (39).

In view of the small sizes of the series, these averages accord remarkably well.
The pooled mean for the 148 men is 77.9.
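
The pooled mean quoted above is a size-weighted average of the five series means, and the arithmetic can be checked with a short sketch. The function name is hypothetical; the means and series sizes are those listed in the text. Note that the sizes as listed sum to 149, one more than the total of 148 quoted, which does not change the rounded result.

```python
def pooled_mean(series):
    """Size-weighted mean of (mean, n) pairs."""
    total_n = sum(n for _, n in series)
    return sum(mean * n for mean, n in series) / total_n


# Samaritan series as listed: (mean cephalic index, number of men).
samaritans = [(76.2, 20), (77.2, 27), (77.6, 27), (78.1, 36), (79.1, 39)]

total_men = sum(n for _, n in samaritans)
pooled = pooled_mean(samaritans)
print(total_men, round(pooled, 1))
```

Weighting by series size matters little here because the five means are so close together, which is the point made in the text about their agreement.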

Weissenberg (1909) has also given cephalic indices of 76.9 for twenty-five
fellahin measured near Jaffa, and 76.7 for thirty in the Safed district, or a pooled
mean of 76.2 for the fifty-five men. This value differs markedly from that of
81.0 given by Kappers (1934) for 139 Arabs from "towns north of the line
Jaffa-Jericho". There appear to be some well-marked regional differences
between Arabs in different localities in Palestine.

The mean cephalic index for the male Lachish series of skulls is 74.3, which
can be supposed to correspond to a value of about 76.3 in the living. The latter
is decidedly less than the mean for one series of Arabs, rather less than the
Jewish means, and practically identical with that for the Arabs measured by
Weissenberg. It is quite unsafe to lay stress on comparisons of a single character,
but those made show that a population with the same cephalic index as that of
the ancient inhabitants of Lachish is living in Palestine to-day. The possibility
that the pre-Christian type has persisted until modern times is not precluded,
though far more evidence, and particularly that of later series of skulls from
the country, will be needed to disclose its racial history in any detail.

10. MANDIBLES AND LONG BONES OF THE LACHISH SERIES

In all there are seventy-six mandibles in the Lachish series, fifty-six from
tomb 129 and the remainder from the three other tombs, most of them being
defective to some extent. Of the total, one is associated with an adult male
cranium and nine with adult female crania. The remaining sixty-six are
unassociated, and they were sexed by anatomical appreciation. Remarks on a few
of the specimens noted as being anomalous are in § 4 above. Measurements
were taken of the adult bones in accordance with the biometric technique
(Morant, Collett & Adyanthaya, 1936). There are thirty-four male and thirty-five
female specimens. Means for these are given in Table XVIII, and comparisons
with other material would not be profitable, as it has been shown (Cleaver, 1937)
that considerably larger numbers would be required to reveal small racial
differences in type. Neither the measurements nor the appearance of the
Lachish mandibles suggest any clear divergence from Ancient Egyptian types.




Other parts of the skeleton are only represented in the Lachish series (all
tombs) by two sacra and nearly 200 long bones, many of which are incomplete.
These are not associated together and no attempt was made to sex them. The
maximum lengths of the adult long bones were determined, and means for them
are given in Table XIX. As far as can be seen from these constants, the Lachish
people were rather short, but no approximation of any value to the average
statures of the men and women can be given. One ulna (No. 2) has a healed
fracture of the lower shaft, and one femur (No. 39) has condyles affected by
arthritis.

TABLE XIX

Means of the maximum lengths of unsexed adult long
bones of the Lachish series


Femora Tibiae Humeri

(oblique) (oblique) (oblique) 




298'6 (11) 243-7 (fi) 2 


Clavietra 


M7-2R (4) 


L 430'46 (20) 372'2B (8) 300'1 (16) 239-8 (6) 2(i(5-() (9) IfhH) (1) 


11. SUMMARY AND CONCLUSIONS

The skeletal remains reported on in this paper for the Trustees of the late
Sir Henry Wellcome were collected at Tell Duweir (Lachish), twenty-five miles
south-west of Jerusalem, by the Wellcome-Marston Expedition to the Near
East from 1933 to 1936. The bones were found in four adjoining tomb chambers
and they are assigned to the seventh and eighth centuries B.C. In all there are
696 crania, the majority of which are more or less imperfect, and much smaller
numbers of mandibles and other bones of the skeleton (see table on p. 103).
Of the crania 360 were judged to be adult male and 274 adult female, the
remaining sixty-one being immature. The origin of the collection is discussed,
and it is concluded that the remains are probably those of people who died as
the result of some catastrophe. The frequencies of occurrence of different states
of closure of the principal calvarial sutures show, from comparisons with other
cranial series, that the adults from Lachish were younger, on the average, than
cemetery populations are expected to be. Very few aged individuals were
interred in the tombs. The normal order of closing of the sutures was sagittal,
coronal, lambdoid. Remarks on unusual conditions and anomalies are given,
the most interesting specimens being three trepanned skulls; two showing marked
artificial deformation and six suspected to have been deformed artificially; a
series of seventeen showing premature closing of the sagittal suture without
clear deformation except in three cases; one believed to be distorted owing to
premature closing of the coronal suture; one with absence of the right auricular
passage; and one with an extensive diseased area on the vault.

Statistics regarding the loss of teeth before death show that they were
remarkably well preserved. Remarks on dental anomalies are given. The most
interesting skull from this point of view is one which was found to have a metal
filling in one of its molars, presumably acquired by accident.

Judging from comparisons of the measurements, there is no reason to doubt
that the series from the four tombs represent precisely the same population, and
the differences found between the male and female adult and juvenile constants
are no greater than those expected in such a case. The variabilities and sex
ratios of the total series (combining skulls from all tombs) are quite unexceptional.

Comparisons are made between the Lachish and twenty-one Ancient Egyptian
and allied series of skulls by the method of the coefficient of racial likeness,
and a classification of the material is presented. The relationships found suggest
that the population of the town in 700 B.C. was entirely, or almost entirely,
of Egyptian origin, very close connexions with some contemporary Egyptian
series being found. They show, further, that the population of Lachish was
probably derived principally from Upper Egypt. Comparisons of measurements
considered singly indicate that the Lachish cranial type has no features which
would be unusual for an Ancient Egyptian type, other than the prominence of
its nasal bones and the curvature of its malar bones. Transverse, horizontal,
and sagittal type contours based on 108 male and eighty-nine female Lachish
skulls are given, and it is shown that they are very similar to some previously
provided for Ancient Egyptian series. There are no good records for any series of
skulls from Palestine other than the Lachish. Its mean cephalic index accords
fairly well with that given for a series of living Palestine Arabs, and it is close
to that for Samaritans. The possibility that the Lachish people have persisted
until to-day is not precluded, but far more evidence would be required to
substantiate such a hypothesis. The series of mandibles and long bones from Lachish
are too small to be of any value for comparative purposes. As far as can be seen
the people were rather short.

APPENDICES

I. Definitions of skull measurements taken

Measurements of the Lachish material were taken in accordance with biometric
practice. The definitions of cranial points given by Buxton & Morant (1933), and those of
mandibular measurements given by Morant, Collett & Adyanthaya (1936), were followed. The
contractions below are used to denote measurements in the tables and text, and their
numbers in Martin's list are given.

C = capacity in c.c. It was not possible to determine the capacities of any of the Lachish
skulls directly, owing to the fact that the interiors of the brain-boxes are coated with mud
and wax which cannot be removed. The reconstruction formulae using L, B, and H' given
by Pearson & Stoessiger (1927) were applied to give the estimates in Table VI, and these
were not used in computing coefficients of racial likeness. L = maximum glabella-occipital
length (M. 1). B = maximum horizontal breadth (M. 8). H' = basio-bregmatic height
(M. 17). LB = basion to nasion (M. 5). B' = minimum frontal breadth (M. 9). S = arc
nasion to opisthion (M. 25). S1 = arc nasion to bregma (M. 26). S2 = arc bregma to lambda
(M. 27). S3 = arc lambda to opisthion (M. 28). S1' = chord nasion to bregma (M. 29).
S2' = chord bregma to lambda (M. 30). S3' = chord lambda to opisthion (M. 31). U =
horizontal circumference measured through the ophryon and directly above the superciliary
ridges (M. 23a). βQ' = transverse circumference from one auricular point to the other,
passing through bregma (M. 24). fml = basion to opisthion (M. 7). fmb = maximum breadth
of foramen magnum (M. 16). G'H = nasion to alveolar point (M. 48). GL = basion to
alveolar point. GB = facial breadth between lowest points on zygomaticomaxillary sutures
(M. 46). J = maximum breadth between zygomatic arches (M. 45). NH, L = nasion to
lowest edge of pyriform aperture on the left side. NB = maximum breadth of pyriform
aperture (M. 54). O1L = maximum breadth of left orbit (M. 51). O2L = maximum height
of left orbit (M. 52). G1 = length of palate from orale to staphylion (M. 62). G2 = breadth
of palate between inner alveolar walls of second molars (M. 63). OH = maximum
projection from biporial axis in the transverse vertical plane, measured on transverse contour.
SC = simotic chord, minimum breadth of nasal bones (M. 57). SS = subtense of simotic
chord. Measurements of the left malar bones, taken in accordance with Woo's instructions
(1937), aro; Ml^ = minimum horizontal arc. MZ, = minimum vertical arc. €(ml] is chord 
between terminals of horizontal arc. S{ml) = maximum subtouso from tho chord, Tho 
occipital index is tho ordy one which needs definition: it Is 



Values for the individual skulls were found with the aid of Miss Tildesley's table of 
the function (Biometrika, 13 (1921), 261-2). P∠ = profile angle between Frankfurt 
horizontal plane and the chord joining nasion to alveolar point. N∠, A∠ and B∠ are 
the angles of the triangle of which the nasion, alveolar point and basion are the apices. 
w_1 = maximum breadth outside condyles. c_yl = maximum length of the left condyle. 
rb' = minimum antero-posterior "breadth" of the left ramus. m_2p_1 = chord between the 
points on the outer left alveolar margin from the middle of the second molar to the middle of 
the first premolar. h_1 = symphyseal height from infradentale to the point farthest removed 
from it in the symphyseal plane. zz = minimum chord between the anterior margins of 
the right and left foramina mentalia. c_rc_r = coronial breadth from right coronion to 
left coronion. M∠ = mandibular angle. c_pl = projective length of the corpus. rl = projective 
length of the left ramus. g_og_o = chord from left gonion to right gonion. ml = maximum 
projective length of the mandible. crh = projective height of the left coronoid process. 
m_2h = projective height of the corpus at the middle point of the outer alveolar margin 
of the second left molar. R∠ = angle of condylar-coronoidal line with ramus tangent. 
C'∠ = angle between the standard horizontal plane and the line joining the infradentale to 
the most anterior point in the standard sagittal plane of the symphysis. 
LIST OF PLATES 

Plate I. Views of the interior of Tomb 120. 

A. Entrance of tomb showing skulls round side. 

B. Interior of tomb with skulls collected round side. 

Plate II. Tomb 120, collections of skulls in situ. 

A. Skulls collected round wall. 

B. A closer view of skulls seen to the left of the beam in Plate IA. 

Plate III. Exceptional crania. 

A. A female skull (No. 485) showing extensive burnt patch. 

B. A male skull (No. 108) showing injuries probably inflicted shortly before death: see p. 104. 

C. D. Male skulls (No. 166, Tomb 107 and No. 179, Tomb 116) of exceptional type; no other 
specimens of this type were found. 

Plate IV. A male cranium (No. 340) with hole in right parietal, probably a trepan, and a sword-cut 
on the same bone close to the coronal suture. 

Plate V. Male crania with trepanned openings. 

A. No. 116. There is a septic area round the trepan in the right parietal. A cut probably made by 
the trepanner can be seen by the lateral outline of the left parietal. 

B. No. 114. 

Plate VI. Crania showing marked artificial deformation. 

A. A male cranium (No. 381). 

B. A female cranium (No. 673). 

Plate VII. Norma lateralis views of five male crania (A-E) believed to be artificially deformed to 
a slight extent, and of a male cranium (F), probably deformed owing to premature closing of 
the coronal suture. A, No. 378; B, No. 376; C, No. 377; D, No. 376; E, No. 379; F, No. 380. 

Plate VIII. Norma verticalis views of three anomalous crania. 

A. A male cranium (No. 380) with coronal suture obliterated. A norma lateralis view of this 
deformed specimen is shown in Plate VIIF. 

B. A female cranium (No. 464) with healed injury on left side of the frontal bone. 

C. A metopic female cranium (No. 670) with wound on left parietal bone and the sagittal suture 
prematurely obliterated. 

Plate IX. Norma lateralis views of a typical female (No. 388) and a typical male (No. 28) cranium 
of the Lachish series. Other views of these two specimens are shown in Plates X-XII. They 
were selected from among the more complete skulls on account of the fact that their measurements 
of shape show no marked divergences from the means for the series. In the case of 
No. 388, none of these measurements differ from the female mean by more than the standard 
deviation of the distribution. The same is true for No. 28 in comparison with the male means. 
All the photographs of the typical skulls are approximately 0.6 natural size (linear dimensions). 

Plate X. Norma facialis views of a typical female (No. 388) and a typical male (No. 28) cranium 
of the Lachish series. 

Plate XI. Norma verticalis views of a typical female (No. 388) and a typical male (No. 28) cranium 
of the Lachish series. 

Plate XII. Norma occipitalis views of a typical female (No. 388) and a typical male (No. 28) 
cranium of the Lachish series. 

Plate XIII. Crania showing obliteration of the sagittal suture. 

A. A female cranium (No. 1172) showing complete obliteration of the suture with distortion. 

B. A male cranium (No. DIM) showing complete obliteration of the suture with distortion. 

C. A female cranium (No. fl(17) showing complete obliteration of the suture with distortion. 

D. A male cranium (No. 3Hii) showing complete obliteration of the suture without apparent 
distortion. 

E. A female cranium (No. .777) showing almost complete obliteration of the suture and post- 
coronal constriction. 

Plate XIV. Three male crania with sutural anomalies. 

A. No. 299. Large wormian bone in right side of coronal suture. 

B. No. fil). Two symmetrical interparietal bones of unusual form. 

C. No. 339. Left side, showing temporal bone largely fused to parietal. The right side of this 
specimen is affected in a similar way. 

Plate XV. Anomalous regions of two male crania. 

A. No. 324. Complete absence of right auricular passage; twice natural size. The left auricular 
passage is normal. 

B. No. .301. Complete absence of left jugular foramen: Ml times natural size. The right foramen 
is normal. 

Plate XVI. A male cranium (No. 382) of unusual form, possibly affected by hydrocephaly. A, 
Norma lateralis; B, Norma facialis; C, Norma verticalis. 

Plate XVII. A female cranium (No. fW2) with large diseased area on frontal bone (see p. llfi). 
A, Norma lateralis; B, C, Nitrnui mtmlk. 

Plate XVIII. Exceptional crania. 

A. A male cranium (No. fi) with large healed wound on frontal bone. 

B. A male cranium (No. i) with diseased area on right parietal. 

C. A metopic female cranium (No. 419) with injury to the right frontal bone. 

D. A female cranium (No. .MS) with depression in left parietal. 

Plate XIX. A female cranium (No. 518) showing a tooth (upper right second molar) with a filling 
presumed to be adventitious. 

A. The palate (1.2 diameters). 

B. The second molar with the filling in situ (O-O diameters). 

C. A skiagram (2.7 diameters) showing the depth of the filling. 

D. The three molars (2.7 diameters) after removal of the filling in the second molar. 

The skiagram in this plate and others in Plates XXII and XXIII were kindly provided by 
Mr C. Bowdler Henry, M.R.C.S. 

Plate XX. A female skull (No. 487) with anomalous jaws. 

A. The mandible from above, showing denticles outside the dental arch. 

B. The palate showing a diastema between the central incisors, and diastemae between the 
lateral incisors and canines. 

C. The right side of the mandible showing denticles. 

Plate XXI. Palates with anomalous dentitions. 

A. A female cranium (No. 44fi) with sockets for three incisors only. 

B. A male cranium (No. 132) with diastemae between canines and premolars, and third molars 
absent. 

C. A juvenile cranium (No. 706) with supernumerary tooth behind the right central incisor. 

D. A female cranium (No. 401) with grossly deflected left canine. 
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Plate XXII. Photographs and skiagrams of jaws with anomalous dentitions. 

A. A female cranium (No. 506) showing incomplete eruption of the second and third right molars. 

B. Skiagram of the same specimen (A) showing that the roots of the partially erupted teeth 
were fully formed. 

C. A female cranium (No. 383) showing a denticle in the palate. 

D. Skiagram of the same (C) showing the limits of the denticle and its crypt. 

E. A female cranium (No. 496) with impaction of the upper left third molar. 

F. Skiagram of the right side of a mandible showing two denticles in the corpus, female (No. 
1066). 

Plate XXIII. Jaws with anomalous dentitions or cysts. 

A. A skiagram of a juvenile jaw (No. 1068), right side, showing canine and first premolar 
unerupted. 

B. A skiagram of the same mandible (A), left side, showing the same teeth on this side unerupted 
(see p. 120). 

C. A male skull (No. 72) showing the third molar on the left side in abnormal position. 

D. Occlusal view of the same jaw (C) showing the abnormal position of the third molar, rotation 
of the second premolar on the left side, a retained milk canine, and absence of the third molar 
on the right side (see description on p. 119). 

E. A female cranium (No. 469) showing a large cyst in the right molar region. 

F. A female cranium (No. 467) showing a large cyst in the anterior part of the palate 
penetrating to the nasal aperture. 
Biometrika, Vol. XXXI, Parts I and II    Plate III 

Risdon: Skulls from Tell Duweir (Lachish) 

A. A female skull (No. 485) showing extensive burnt patch. 
B. A male skull (No. 108) showing injuries which probably caused death. 
C. A male skull (No. 156) of exceptional type. 

Plate V. Male crania with trepanned openings. A. No. 115. B. No. 114. 

Plate VI. Marked artificial deformation of a male (A) and a female (B) cranium. B. No. 673. 

Plate VII. Five male crania (A-E) believed to be artificially deformed to a slight extent and a male 
cranium (F) probably deformed owing to premature closing of the coronal suture. 
A. No. 378. E. No. 379. F. No. 380. 

Plate IX. A typical female (No. 388, above) and a typical male cranium (No. 28) 
of the Lachish series. 

Plate XI. A typical female and a typical male cranium of the Lachish series. B. No. 28, male. 

A typical female (No. 388, above) and a typical male cranium (No. 28) 
of the Lachish series. 

Plate XIV. Male crania with sutural anomalies. A. No. 299, large wormian bone in right side 
of coronal suture. B. Two symmetrical interparietal bones. 

Plate XIX. A tooth (upper right second molar) with a filling presumed to 
be adventitious. No. 518, female. 

Plate XX. The anomalous jaws of a female skull (No. 437). Diastemae between central incisors and 
between lateral incisors and canines. Denticles outside the right side of the 
lower dental arch. 

Plate XXI. Palates with anomalous dentitions. 
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Foreword by C. E. ZoBell 

Several different methods have been described for estimating the densities of 
bacterial populations in solutions. The most popular of these are plating pro- 
cedures, the minimum (successive) dilution method, and various direct micro- 
scopic counting methods. The accuracy of the two former methods is predicated 
upon the ability of the bacteria to multiply in nutrient media. Since no one 
medium under any one set of conditions can provide for the multiplication of 
bacteria with highly diverse nutritional and environmental requirements, it is 
not surprising that direct microscopic counts on materials containing a hetero- 
geneous bacterial flora are usually appreciably higher than plate counts or 
dilution method counts. However, direct microscopic counts fail to differentiate 
between dead and living bacteria and are beset with almost insurmountable 
technical difficulties due to the minuteness of the bacteria. Therefore the choice 
of an enumeration procedure usually rests between plating procedures and the 
dilution method. 

Briefly, and in its simplest form, the dilution method consists of the following 
procedure: A sample (e.g. 1 c.c.) of the solution under investigation is taken and 
inoculated into sterile nutrient medium in a test tube. Part of the original solution 
is then diluted in a certain ratio (usually 10-fold) and the same size sample of this 
diluted solution is inoculated into nutrient medium in a second test tube. This 
process is repeated as many times as seems necessary; so that in the end the 
experimenter will have prepared a series of inocula representing successive 
10-fold (or some other ratio) dilutions of the original solution. The highest dilution 
in this series which shows growth is taken to indicate the magnitude of the 
bacterial density in the original solution. 

Such an experiment, however, obviously tells nothing but an integral power 
of 10 which more or less accurately determines what we may call the "order 
of magnitude" of the density. To obtain greater accuracy, investigators (cf. 
Halvorson & Ziegler, 1933) have suggested that two to ten or more tubes of 
nutrient medium be inoculated with each dilution of the material. Upon the 
basis of results obtained in such multiple tube experiments, several methods of 
estimating the bacterial densities which gave rise to these results have been 
suggested, as described by Mr Gordon. 

In comparing the minimum dilution method with plating procedures for the 
enumeration of marine bacteria, in this laboratory, the dilution method probability 
tables of Halvorson & Ziegler were used. These tables give estimates 
of bacterial densities based on the numbers of "positives" (i.e. showing 
growth) observed in ten inocula of each of three successive 10-fold dilutions. 
More dilutions than this (e.g. five or six) were of course in most cases prepared, 
and the estimates were based on the most critical set of three of these dilutions. 
The results revealed that the dilution method estimates averaged about 
20% higher than plate counts on the same material, although on some individual 
samples they were actually lower. Duplicate determinations by the 
two methods on the same samples of material showed that the degree of reproducibility 
of the plate counts was much higher than that for the dilution method 
counts. In view of the fact that the dilution method has applications where the 
plate count cannot be used, further efforts were made to increase its accuracy. 
This can be accomplished in either of two ways; namely, by inoculating more 
tubes with each dilution or by using more dilutions. The latter alternative was 
tried in which, instead of diluting each time by 10-fold, the dilutions were made 
$\sqrt{10}$-fold. Under these conditions it seemed that more reliable and more reproducible 
counts were obtained by the dilution method even when calculated by a 
crude arithmetical method. 

It was at this stage in the experimental work when Mr Gordon was consulted 
to aid with the calculations. After examining the methods by which Halvorson 
& Ziegler obtained their results, he expressed the opinion that these methods 
appeared questionable. He has developed the procedures presented in the 
following paper for determining the geometric mean estimates, as he has described. 
It is to be hoped that his method will yield more consistent and more reproducible 
results. A brief résumé of the results has already been published (Gordon, 1938). 

I. Introduction 

The underlying assumption by which estimates are made of a bacterial 
population density, from the results of the successive-dilution technique, is that 
the individual bacteria are distributed in the space occupied by the fluid medium 
containing them, in the same way that we should conceive molecules to be distributed 
in a solution, if each molecule were unaffected by the presence of any 
other molecule. (This is not, for instance, true of molecules possessing magnetic 
or electrical polarity, or of ions.) In other words, the individual bacteria are 
assumed to be randomly distributed in their medium; so that if, for example, a 
sample of 1 c.c. is taken from 100 c.c. of solution, each bacterium in the whole 
solution enjoys an independent chance of 1 out of 100 (probability = 0.01) of 
being caught up in this sample. 

Now, in so far as the bacteria exist as individuals they undoubtedly do not 
satisfy such an assumption “to the letter”, but probably exert a certain mutual 
uniformizing influence on one another. That is, returning to the example just 
cited, if there are 10,000 bacteria in the 100 c.c. of solution, the probability is 
likely to be in reality 0, and not $0.01^{10,000}$, that all the bacteria should be caught 
up in the 1 c.c. sample. This, however, simply implies that in a series of such 1 c.c. 
samples the numbers of bacteria caught up in them are somewhat more closely 
clustered about the true average density, than we should compute them to be 
on the basis of the above assumption of complete randomness. 
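
The size of the "spread" under complete randomness can be illustrated numerically. The following is a minimal Python sketch, assuming the binomial model implied by the text (each of the 10,000 bacteria independently has probability 0.01 of entering the 1 c.c. sample); the function name is illustrative and does not come from the paper.

```python
import math

def sample_count_mean_and_sd(total_bacteria, sampled_fraction):
    """Mean and standard deviation of the count in one sample, assuming each
    bacterium enters the sample independently (binomial model)."""
    mean = total_bacteria * sampled_fraction
    sd = math.sqrt(total_bacteria * sampled_fraction * (1.0 - sampled_fraction))
    return mean, sd

mean, sd = sample_count_mean_and_sd(10_000, 0.01)
poisson_sd = math.sqrt(mean)  # spread under the Poisson approximation
# log10 of the probability 0.01**10000 that every bacterium is caught
log10_prob_all_caught = 10_000 * math.log10(0.01)

print(mean, sd, poisson_sd, log10_prob_all_caught)
```

Under these assumptions the binomial and Poisson spreads are nearly equal, and the probability that all the bacteria fall in one sample is far too small to be represented as an ordinary floating-point number, which is why its logarithm is computed instead.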

In this sense our assumption of complete randomness amounts simply to 
"considering a least favourable case" if we are computing the probable "spreads" 
(standard deviations) of actual counts. On the other hand, the assumption causes 
us to over-estimate somewhat the probabilities of obtaining no bacteria at all 
in our 1 c.c. samples, and to underestimate the probabilities of obtaining “one 
or more” viable bacteria in the samples. It is these probabilities that enter into 
the estimate of population density by the dilution method. No possibility appears 
for rationally correcting for such discrepancies; fortunately, however, the two 
errors committed will to some extent cancel each other in the formulae which 
follow. But there is suggested in these considerations the possibility of developing 
rational means of measuring the uniformizing influences associated with various 
specific cultures in nutrient solution. The values obtained might be correlated 
with population densities and various specific properties, and might yield very 
interesting interpretations. 

Another, perhaps more serious, objection to the assumption of "randomness" 
is the known fact that most species of bacteria show tendencies to gather 
in multicellular groups within the solution, as well as to congregate on glass walls 
and thus to go out of solution. These processes must affect all methods of counting 
equally, however, so far as the samples are taken in the same way; hence they 
should not interfere with the comparability of counts made by different methods. 

It might be mentioned here, though, that in comparing plate counts with 
dilution estimates, account should be taken of the fact that the dilution estimates 
include obligative anaerobes, while plate counts do not. 



Estimating Bacterial Populations by Dilution Method 


II. Methods of Estimating Populations from Results of 
Successive-Dilution Technique 

Assuming random distribution of individuals in a large volume of solution, 
as described above, let $p$ represent the total number of individuals in the whole 
solution, divided by the total volume occupied. That is, $p$ represents the mean 
density of population in the solution, as a number of individuals per 
unit volume. Then the probability that in a sample of one unit volume (e.g. 1 c.c.) 
of solution there will be $r$ individuals (where $r$ is a positive integer) is given by 

$$P(r) = \frac{e^{-p}\,p^{r}}{r!}, \tag{1}$$

which is the Poisson distribution. 

The probability that there are 0 bacteria is accordingly $P(0) = e^{-p}$, and the 
probability of obtaining one or more bacteria in the sample (i.e. any number except 
0) is $1 - P(0) = 1 - e^{-p}$, since the distribution (1) is normalized. Hence if we take 
ten test-tubes containing sterile nutrient solution, and inoculate each with 1 unit 
volume of our solution, then the probability that $n$ ($\le 10$) of these tubes will 
have received some number of bacteria, and $(10 - n)$ will have received none, is 

$$G_p(n) = \frac{10!}{n!\,(10-n)!}\,(e^{-p})^{10-n}\,(1 - e^{-p})^{n}, \tag{2}$$

by well-known rules for computing probabilities. If we understand $p$ to represent 
mean density of viable bacteria, then $G_p(n)$ in (2) represents likewise the probability 
that $n$ of the ten test-tubes so inoculated will show growth, and simultaneously 
the other $(10 - n)$ of the tubes will fail to show growth. 
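
Equations (1) and (2) can be evaluated directly. The following Python sketch is only an illustration: the function names and the density 0.7 are hypothetical choices, not values from the paper.

```python
import math

def poisson_probability(r, p):
    """Equation (1): probability of r individuals in one unit volume."""
    return math.exp(-p) * p ** r / math.factorial(r)

def probability_n_positive(n, p, tubes=10):
    """Equation (2): probability that n of the inoculated tubes show growth."""
    sterile = math.exp(-p)       # P(0): the tube receives no bacteria
    fertile = -math.expm1(-p)    # 1 - P(0): the tube receives at least one
    return math.comb(tubes, n) * sterile ** (tubes - n) * fertile ** n

density = 0.7  # hypothetical mean density per unit volume
distribution = [probability_n_positive(n, density) for n in range(11)]
most_likely_n = max(range(11), key=lambda n: distribution[n])
print(most_likely_n, distribution[most_likely_n])
```

Because (2) is a binomial distribution in the number of positive tubes, the eleven probabilities sum to one for any density.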

Let us now inoculate three rows of ten tubes each, the first row with samples 
of the original solution, the second row with samples of a 10-fold dilution of the 
original solution, and the third row with samples of a 100-fold dilution of the 
original solution. Let $p$ represent the mean density of individuals in the middle 
(10-fold) dilution, and denote by $n_{10}$, $n_1$ and $n_{0.1}$ the numbers of tubes showing 
growth in the three respective rows. Then the probability of observing a result 
represented by the triplicate $n$ is 

$$G_p(n_{10}, n_1, n_{0.1}) \equiv G_p(n) = \frac{(10!)^3\,(e^{-10p})^{10-n_{10}}\,(e^{-p})^{10-n_1}\,(e^{-p/10})^{10-n_{0.1}}\,(1-e^{-10p})^{n_{10}}\,(1-e^{-p})^{n_1}\,(1-e^{-p/10})^{n_{0.1}}}{n_{10}!\,(10-n_{10})!\;n_1!\,(10-n_1)!\;n_{0.1}!\,(10-n_{0.1})!}. \tag{3}$$

It is upon these three equations (1), (2), and (3) that several methods of estimating 
population densities from the results $(n_{10}, n_1, n_{0.1}) = n$ of successive dilution 
experiments are based. (These equations of course refer to successive 10-fold 
dilutions, and ten tests to each dilution. Analogous equations are easily formed 
corresponding to other dilution stages and numbers of tests.) 
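
As a hedged illustration of (3), the sketch below multiplies the three binomial factors, one for each row, with the density in each row taken as 10, 1 and 0.1 times the density $p$ of the middle dilution. The names and the example result (10, 6, 1) are hypothetical.

```python
import math

# Density in each row relative to the middle (10-fold) dilution.
DILUTION_FACTORS = (10.0, 1.0, 0.1)

def likelihood(counts, p, tubes=10):
    """Equation (3): probability of the triplet of positive-tube counts."""
    value = 1.0
    for factor, n in zip(DILUTION_FACTORS, counts):
        sterile = math.exp(-factor * p)
        fertile = -math.expm1(-factor * p)
        value *= math.comb(tubes, n) * sterile ** (tubes - n) * fertile ** n
    return value

observed = (10, 6, 1)  # hypothetical (n_10, n_1, n_0.1)
for p in (0.2, 0.5, 1.0, 2.0):
    print(p, likelihood(observed, p))
```

Summing the function over all $11^3$ possible triplets at a fixed density gives one, which is a convenient check that the three factors have been combined correctly.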
Wells & Wells have published a method of estimating population 
density. Their method, applied to the present case, amounts simply to computing 
three values of $p$, to fit each of the equations 

$$e^{-10p_{10}} = \frac{10-n_{10}}{10}, \qquad e^{-p_1} = \frac{10-n_1}{10}, \qquad e^{-p_{0.1}/10} = \frac{10-n_{0.1}}{10}, \tag{4}$$

where $n_{10}$, $n_1$, $n_{0.1}$ are the "positives" in the three series of tests. The 
geometric mean $(p_{10}\,p_1\,p_{0.1})^{1/3}$ of these three values is then accepted as an estimate 
of the true value of $p$. 

On the face of it, such a way of arriving at an estimate appears rather 
naive. Such computations are usually unsafe when the data are subject to 
large uncertainties. Apparently the only reason these authors used a geometric 
mean of the three values instead of some other mean is that from the equation 

$$e^{-rp_r} = \frac{10-n_r}{10} \tag{5}$$

we obtain 

$$\log p_r = \log\Bigl(\log\frac{10}{10-n_r}\Bigr) - \log r, \tag{6}$$

which is linear in $\log p_r$, $\log r$, and $\log\bigl(\log\frac{10}{10-n_r}\bigr)$, so that the arithmetic means 
of these terms are related in the same way as the terms themselves. (Note that 
$\tfrac{1}{3}[\log p_{10} + \log p_1 + \log p_{0.1}] = \log\,(p_{10}\,p_1\,p_{0.1})^{1/3}$.) 
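
The geometric-mean procedure just described can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the published method itself: each row is solved from (5) as $p_r = \log\{10/(10-n_r)\}/r$ with natural logarithms, the function names are invented for the example, and rows in which all or none of the tubes show growth are rejected because (6) then has no finite value.

```python
import math

def single_row_estimate(n_positive, factor, tubes=10):
    """Solve exp(-factor * p) = (tubes - n) / tubes for p, as in (5) and (6)."""
    if not 0 < n_positive < tubes:
        raise ValueError("row must contain both positive and sterile tubes")
    return math.log(tubes / (tubes - n_positive)) / factor

def geometric_mean_estimate(counts, factors=(10.0, 1.0, 0.1)):
    """Geometric mean of the three single-row estimates of p."""
    logs = [math.log(single_row_estimate(n, f)) for n, f in zip(counts, factors)]
    return math.exp(sum(logs) / len(logs))

estimate = geometric_mean_estimate((9, 4, 1))  # hypothetical counts
print(estimate)
```

The need to reject rows such as $n_{10} = 10$ shows one practical weakness of the procedure: a row that is entirely positive or entirely sterile contributes no usable single-row value.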

However, bacteria counts are usually expressed "to a certain number (e.g. 3) 
of significant figures"; that is, interest is centred on the proportionate errors, 
not on absolute errors. The difference between no bacteria and one is more 
important than that between 1000 and 1001. This consideration indicates that 
the expectation of $\log p$, given the data, is more like what we want than, say, the 
expectation of $p$. But it does not follow that the mean of three separate estimates 
of $\log p$ is the best estimate of $\log p$, because the estimates are not equally accurate; 
they should be weighted by the inverse square of the standard deviation of $\log p$ 
corresponding to an observation $n_r$, which would have to be computed from an 
"inverse" distribution derived from (2) by use of Bayes's formula. 

Halvorson & Ziegler (1933) published a mimeographed tract in which they 
presented tables giving the modes of $G_p(n)$ (see equation (3)) together with the 
corresponding maximum values of $G_p(n)$, corresponding to various combinations 
$(n_{10}, n_1, n_{0.1}) = n$ (that is, considering $G_p(n)$ as a function of $p$). Their notation 
is not the same as that here used; it is as follows: 

$$\hat p = x = p \ \text{corresponding to maximum of}\ G_p(n), \qquad n_{10} = p_1,\quad n_1 = p_2,\quad n_{0.1} = p_3.$$

This estimate $\hat p$ of $p$ is obviously that obtained by the "method of maximum 
likelihood", which is associated with the name of R. A. Fisher. The relation of 
this method to the principle of inverse probability has been discussed by Jeffreys 
(1938a, b). It is equivalent to taking a uniform prior probability for $p$, and then 
adopting the mode of the posterior probability as the estimate. 
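
A modal (maximum likelihood) estimate of this kind can be found numerically without tables. The sketch below is not Halvorson & Ziegler's procedure; it simply locates the zero of the derivative of $\log G_p(n)$ by bisection on a logarithmic scale, which is valid because that derivative decreases steadily as $p$ increases. The search limits and names are assumptions made for the example.

```python
import math

FACTORS = (10.0, 1.0, 0.1)  # row densities relative to the middle dilution

def log_likelihood_slope(counts, p, tubes=10):
    """Derivative with respect to p of log G_p(n), from equation (3)."""
    slope = 0.0
    for factor, n in zip(FACTORS, counts):
        slope -= factor * (tubes - n)                  # sterile tubes
        slope += n * factor / math.expm1(factor * p)   # tubes showing growth
    return slope

def maximum_likelihood_density(counts, low=1e-9, high=60.0, iterations=200):
    """Mode of G_p(n) in p; needs at least one positive and one sterile tube."""
    if log_likelihood_slope(counts, low) <= 0 or log_likelihood_slope(counts, high) >= 0:
        raise ValueError("no interior maximum for these counts")
    for _ in range(iterations):
        mid = math.sqrt(low * high)  # bisect on a logarithmic scale
        if log_likelihood_slope(counts, mid) > 0:
            low = mid
        else:
            high = mid
    return math.sqrt(low * high)

p_hat = maximum_likelihood_density((10, 6, 1))  # hypothetical counts
print(p_hat)
```

When every tube is sterile or every tube shows growth the likelihood has no interior maximum, and the sketch raises an error rather than returning a boundary value.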

In the present instance the significance of a modal estimate of $p$ is impaired by 
the fact that the a posteriori distributions of $p$ (pertaining in particular to 
uniform a priori probability) vary widely in form with different values of 
the $n$'s. This is indicated by the comparison of the computed means 
with corresponding modal estimates in the last section of this paper. The point 
is verified by a series of laboratory comparisons of modal (Halvorson & Ziegler) 
estimates with direct plate counts on the same materials, reported elsewhere 
(Gordon & ZoBell, 1939). Thus the relationship of the mode to the distribution 
of which it is a representative is a varying quality, and the mode is deprived of 
any immediate significance. 

It has already been mentioned that the particular estimate of $p$ that is best 
adapted to the purposes of bacteriology corresponds to the expectation of the 
logarithm, on the ground that we are more interested in the ratio of $p$ in two 
samples than in the absolute difference. Further, since $p$ can have values from 
0 to $\infty$, but $\log p$ from $-\infty$ to $+\infty$, the probability of $\log p$ has an opportunity of 
being nearly normally distributed that is denied to $p$; similarly $\tanh^{-1} r$ for correlation 
coefficients, and the logarithm of the ratio of two standard deviations, have nearly 
normal probability distributions, and have been extensively used for that reason 
on Fisher's recommendation. If we can find the expectation of $\log p$ and the 
second moment of its probability distribution, a normal curve with corresponding 
mean and standard deviation should give a good representation of the 
law as a whole. 


III. The Estimation of log p 

Using the Bayes-Laplace formula with uniform prior probability for $p$, we 
obtain from (3) for the posterior probability distribution of $p$, 

$$P(p) = \frac{G_p(n)}{\int_0^\infty G_p(n)\,dp}. \tag{7}$$

The expectation of $\log p$ is 

$$\overline{\log p} = \int_0^\infty \log p\;P(p)\,dp,$$

which, cancelling factors in (3) and (7), equals 

$$\overline{\log p} = \frac{\int_0^\infty \log p\;e^{-(111-10n_{10}-n_1-0.1n_{0.1})p}\,(1-e^{-10p})^{n_{10}}(1-e^{-p})^{n_1}(1-e^{-p/10})^{n_{0.1}}\,dp}{\int_0^\infty e^{-(111-10n_{10}-n_1-0.1n_{0.1})p}\,(1-e^{-10p})^{n_{10}}(1-e^{-p})^{n_1}(1-e^{-p/10})^{n_{0.1}}\,dp}. \tag{8}$$

The integrands can obviously be expressed in the form 

$$\{\log p \ \text{or}\ 1\}\,(1-E^{100})^{n_{10}}(1-E^{10})^{n_1}(1-E)^{n_{0.1}}\,e^{-x_0p/10},$$

where $E$ is the Boolean operator $= (1+\Delta)$, $\Delta$ meaning difference with regard to 

$$x_0 = 1110 - 100n_{10} - 10n_1 - n_{0.1}.$$

Since $E$ is commutative with the sign of integration, we may therefore write 

$$\overline{\log p} = \frac{(1-E^{100})^{n_{10}}(1-E^{10})^{n_1}(1-E)^{n_{0.1}}\int_0^\infty \log p\;e^{-x_0p/10}\,dp}{(1-E^{100})^{n_{10}}(1-E^{10})^{n_1}(1-E)^{n_{0.1}}\int_0^\infty e^{-x_0p/10}\,dp}. \tag{9}$$

That this is so may be seen by expanding the operator 

$$H = (1-E^{100})^{n_{10}}(1-E^{10})^{n_1}(1-E)^{n_{0.1}} \tag{10}$$

as a polynomial in $E$, then operating term by term on the integrands. The result 
will be the same as the result of expanding the integrands of (8) in powers of $e^{-p/10}$. 
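The integrals in (9) may be determined by means of the following identities: 

$$\int_0^\infty p^{u}e^{-xp}\,dp = \frac{\Gamma(u+1)}{x^{u+1}}, \qquad \int_0^\infty p^{u}\log p\;e^{-xp}\,dp = \frac{\Gamma(u+1)}{x^{u+1}}\,\{F(u+1) - \log x\}, \tag{11}$$

where $F(u+1)$ is the "digamma function" defined by $F(u+1) = d\log\Gamma(u+1)/du$ 
(see British Association Tables, vol. 1 (1931), with a different notation). 

Place $u = 0$ in these identities; there results 

$$\int_0^\infty e^{-x_0p/10}\,dp = \frac{10}{x_0}, \qquad \int_0^\infty \log p\;e^{-x_0p/10}\,dp = \frac{10}{x_0}\Bigl\{F(1) - \log\frac{x_0}{10}\Bigr\}, \tag{12}$$

where $F(1) = -0.57722$. 

In this way we obtain simply from (9), for $\overline{\log p}$, 

$$\overline{\log p} = F(1) + \log 10 - \frac{H\{(\log x_0)/x_0\}}{H\{1/x_0\}}, \tag{13}$$

where $H$ is the operator defined in (10). 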
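
Formula (13) can be evaluated by expanding $H$ as a polynomial in $E$, provided enough figures are carried to survive the cancellation among the alternating terms. The following Python sketch does this with the standard `decimal` module at 60 significant figures; the precision, the function names and the quoted value of Euler's constant are choices made for the illustration, and natural logarithms are assumed throughout.

```python
from decimal import Decimal, getcontext
from math import comb

getcontext().prec = 60  # generous, to absorb cancellation between terms
EULER_GAMMA = Decimal("0.57721566490153286060651209")  # F(1) = -EULER_GAMMA

def operator_coefficients(n10, n1, n01):
    """Coefficients of H = (1-E^100)^n10 (1-E^10)^n1 (1-E)^n01 as {power: coeff}."""
    coeffs = {0: 1}
    for step, n in ((100, n10), (10, n1), (1, n01)):
        expanded = {}
        for shift, c in coeffs.items():
            for j in range(n + 1):
                key = shift + step * j
                expanded[key] = expanded.get(key, 0) + c * (-1) ** j * comb(n, j)
        coeffs = expanded
    return coeffs

def posterior_mean_log_p(n10, n1, n01):
    """Equation (13): expectation of log p under a uniform prior."""
    x0 = 1110 - 100 * n10 - 10 * n1 - n01
    if x0 <= 0:
        raise ValueError("all tubes positive: the posterior cannot be normalised")
    numerator = Decimal(0)
    denominator = Decimal(0)
    for shift, coeff in operator_coefficients(n10, n1, n01).items():
        y = Decimal(x0 + shift)          # E^shift acting on a function of x0
        denominator += Decimal(coeff) / y
        numerator += Decimal(coeff) * y.ln() / y
    return -EULER_GAMMA + Decimal(10).ln() - numerator / denominator

print(posterior_mean_log_p(10, 6, 1))  # hypothetical counts
```

Repeating the calculation with the precision lowered to ordinary double-precision levels is a quick way to see the loss of accuracy that makes a transformation of these alternating sums desirable.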
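IV. The Standard Deviation of log p 

The standard deviation of $\log p$ is the square root of the second moment of 
$\log p$ about its mean $\overline{\log p}$. Using the methods of the previous section, 
with standard formulae of statistics, we obtain 

$$\sigma^2 = \overline{(\log p)^2} - (\overline{\log p})^2, \qquad \overline{(\log p)^2} = \frac{H\int_0^\infty (\log p)^2\,e^{-x_0p/10}\,dp}{H\int_0^\infty e^{-x_0p/10}\,dp}. \tag{14}$$

Now, as before, 

$$\int_0^\infty p^{u}(\log p)^2 e^{-xp}\,dp = \frac{\partial^2}{\partial u^2}\int_0^\infty p^{u}e^{-xp}\,dp = \frac{\Gamma(u+1)}{x^{u+1}}\bigl[F'(u+1) + \{F(u+1) - \log x\}^2\bigr],$$

whence we easily find 

$$\int_0^\infty (\log p)^2 e^{-x_0p/10}\,dp = \frac{10}{x_0}\bigl[F'(1) + \{F(1)+\log 10\}^2 - 2\{\log 10 + F(1)\}\log x_0 + (\log x_0)^2\bigr], \tag{15}$$

$$\sigma^2 = -(\overline{\log p})^2 + \{F(1)+\log 10\}^2 + F'(1) - 2[F(1)+\log 10]\,\frac{H\{(\log x_0)/x_0\}}{H\{1/x_0\}} + \frac{H\{(\log x_0)^2/x_0\}}{H\{1/x_0\}}, \tag{16}$$

where 

$$F(1) = -0.577216, \qquad F'(1) = 1.644934.$$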
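
The variance can be evaluated in the same way as the mean. The Python sketch below is an illustration only: it assumes natural logarithms, the names are invented, and it uses the algebraic simplification $\sigma^2 = F'(1) + B - A^2$, where $A = H\{(\log x_0)/x_0\}/H\{1/x_0\}$ and $B = H\{(\log x_0)^2/x_0\}/H\{1/x_0\}$, which follows from substituting (13) into (16) and is not stated in that form in the text.

```python
from decimal import Decimal, getcontext
from math import comb

getcontext().prec = 60  # generous, to absorb cancellation between terms
TRIGAMMA_1 = Decimal("1.6449340668482264365")  # F'(1) = pi**2 / 6

def operator_coefficients(counts):
    """Coefficients of H from (10) as a dictionary {power of E: coefficient}."""
    coeffs = {0: 1}
    for step, n in zip((100, 10, 1), counts):
        expanded = {}
        for shift, c in coeffs.items():
            for j in range(n + 1):
                key = shift + step * j
                expanded[key] = expanded.get(key, 0) + c * (-1) ** j * comb(n, j)
        coeffs = expanded
    return coeffs

def posterior_variance_log_p(counts):
    """Variance of log p under the posterior (7), via the sums in (13) and (16)."""
    n10, n1, n01 = counts
    x0 = 1110 - 100 * n10 - 10 * n1 - n01
    if x0 <= 0:
        raise ValueError("all tubes positive: the posterior cannot be normalised")
    s0 = s1 = s2 = Decimal(0)
    for shift, c in operator_coefficients(counts).items():
        y = Decimal(x0 + shift)
        log_y = y.ln()
        s0 += Decimal(c) / y
        s1 += Decimal(c) * log_y / y
        s2 += Decimal(c) * log_y * log_y / y
    return TRIGAMMA_1 + s2 / s0 - (s1 / s0) ** 2

variance = posterior_variance_log_p((10, 6, 1))  # hypothetical counts
print(variance, variance.sqrt())
```

When no tube shows growth the posterior is a simple exponential and the variance reduces to $F'(1)$, which gives a convenient check on the arithmetic.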


V. Transformations 

In the fractional terms in (13) and (16), the numerators and denominators 
each represent finite alternating series; but these series are useless for computing 
purposes, because the terms are so nearly equal that each would have to be 
expressed with at least twenty significant figures to yield sufficiently accurate 
results. Hence we carry out the following transformations: 

Referring to (10) we take out $(1-E)^N = (-\Delta)^N$ as a factor from the operator 
$H$, and obtain 

$$H = H_0\,(-\Delta)^N, \tag{17}$$

where $N = \Sigma n = n_{10} + n_1 + n_{0.1}$, and 

$$H_0 = (1 + E + \dots + E^{99})^{n_{10}}\,(1 + E + \dots + E^{9})^{n_1}. \tag{18}$$
If this last expression is expanded, evidently all terms must turn out to have
positive coefficients; so that if the functions 



can be accurately determined, the required end result is simply a weighted sum of 
terms of like sign, and there is no cancellation. 

In the first place it is very simple to obtain

$$(-\Delta)^N \frac{1}{x} = \frac{N!\,(x-1)!}{(x+N)!} \qquad (19)$$

by mathematical induction.
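The identity for the repeated difference of 1/x can be checked exactly with rational arithmetic. The sketch below is only an illustration: the function names are hypothetical, and it assumes the usual convention that E shifts the argument by one, so that (-Δ)^N = (1 - E)^N.

```python
from fractions import Fraction
from math import comb, factorial


def reciprocal(x):
    """The function 1/x as an exact fraction."""
    return Fraction(1, x)


def neg_delta_power(f, x, N):
    """Apply (-Delta)^N to f at x, where Delta f(x) = f(x + 1) - f(x).

    Uses the binomial expansion (1 - E)^N = sum_k (-1)^k C(N, k) E^k.
    """
    return sum((-1) ** k * comb(N, k) * f(x + k) for k in range(N + 1))


def closed_form(x, N):
    """Right-hand side of the identity: N! (x - 1)! / (x + N)!."""
    return Fraction(factorial(N) * factorial(x - 1), factorial(x + N))


if __name__ == "__main__":
    for N in range(5):
        print(N, neg_delta_power(reciprocal, 3, N), closed_form(3, N))
```

Because every term is an exact fraction, the alternating sum shows no loss of figures here; in fixed-precision arithmetic the same sum cancels heavily, which is the difficulty the transformations of this section are meant to avoid.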

To obtain $(-\Delta)^N(\log x / x)$ it suffices to expand $\log(x+v)/(x+v)$ as a power
series in v and operate with $(-\Delta)^N$ termwise on successive powers of v at v = 0, using
the formula

$$(-\Delta)^N v^p \Big|_{v=0} = (-1)^N \frac{p!}{(p-N)!}\, B^{(-N)}_{p-N}, \qquad (20)$$

where $B^{(-N)}_{p-N}$ are the “generalized Bernoulli numbers” of order (-N) and degree
(p - N) (cf. L. M. Milne-Thomson, 1933, p. 134). By this means we finally obtain


^ ^ a: “ {x + N)\ 


log® 


« (_1)P-A+1 p\ 

{p-N)\ 



By similarly treating $\log(x+N-v)/(x+N-v)$ it is easy to obtain also




log®_iV!(®- 1)! 
X {x + N ) ! 


log(®+iV) 


p\ 


p°iV 


{x+N)^+'^(p-N)\ 


^+2+-” 


These formulae (21) and (21a) can evidently both be used simultaneously to 
obtain upper and lower bounds to the required result. Both are convergent if 
x>N. 


The sums

$$1 + \frac{1}{2} + \frac{1}{3} + \dots + \frac{1}{p}$$

in (21) and (21a) can be computed directly, or by means of various short formulae
of which the following may be suggested (Boole, 1872, p. 92):

$$1 + \frac{1}{2} + \dots + \frac{1}{p} = 0.577216 + \log p + \frac{1}{2p} - \frac{1}{12p^2} + \frac{1}{120p^4} - \dots \qquad (22)$$
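As a rough check on the short formula for the harmonic sums, the following sketch compares it with direct summation. The function names are hypothetical, and the series is truncated after the term in 1/p^4, so the agreement is only approximate for small p.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant, 0.577216 in the text


def harmonic_direct(p):
    """1 + 1/2 + ... + 1/p by direct summation."""
    return math.fsum(1.0 / k for k in range(1, p + 1))


def harmonic_short(p):
    """Short formula truncated after the 1/(120 p^4) term."""
    return (EULER_GAMMA + math.log(p) + 1.0 / (2 * p)
            - 1.0 / (12 * p ** 2) + 1.0 / (120 * p ** 4))


if __name__ == "__main__":
    for p in (5, 10, 27, 100):
        print(p, harmonic_direct(p), harmonic_short(p))
```

The first neglected term is of order 1/(252 p^6), so the truncated formula is already very close for p of 10 or more.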
The Bernoulli numbers are easily computed by means of the reversion formula
(Milne-Thomson, 1933, p. 129), 






(23) 


1 + 


N 


together with the values 5® = 0 if BJt"’ = 1. Also polynomial formulae 
are given in Davis’s tables. We have computed them as far as N = 27, ν = 7,
of which a tabulation is presented at the end of this paper. 
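The generalized Bernoulli numbers of negative order can also be generated mechanically. The sketch below does not use the reversion formula (23); it assumes instead the standard identity B^(-N)_ν = S(N + ν, N) / C(N + ν, N), where S denotes a Stirling number of the second kind, which follows from the generating function ((e^t - 1)/t)^N. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial


def stirling2(n, k):
    """Stirling number of the second kind S(n, k), by the explicit sum."""
    total = sum((-1) ** (k - j) * comb(k, j) * j ** n for j in range(k + 1))
    return total // factorial(k)  # the sum is always divisible by k!


def gen_bernoulli_neg(N, nu):
    """Generalized Bernoulli number of order -N and degree nu.

    Assumed identity (not taken from the paper):
    B^(-N)_nu = S(N + nu, N) / C(N + nu, N).
    """
    return Fraction(stirling2(N + nu, N), comb(N + nu, N))


if __name__ == "__main__":
    for N in (1, 2, 3):
        print(N, [str(gen_bernoulli_neg(N, nu)) for nu in range(1, 8)])
```

For N = 2 this gives 1, 7/6, 3/2, 31/15, 3, 127/28, 85/12, which appears to agree with the legible entries of the appended table.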

To deal with the function $(-\Delta)^N \{(\log x)^2 / x\}$ we make use of Boole’s operator identity

= (24) 

If this is expanded in powers of d/dx and made to operate on $(\log x)^2/x$, the result
is a series in the derivatives of the latter function. These derivatives are found 
to have the form 


dP (loga;)^ _ (- l)^p!(loga:)^ p 


dx'>^ X 

and the final result is 


Y'V’\'X 




p\ 


pi-u) 


X p=N 

In an analogous manner we also obtain 


21oga;B<jk+x^) , 


(p-N)\ +(j,_2)! 


(25) 




(l0giP)2 


= (-i)'"2: 


p\ 


ni-s) 

^P~N 


[log(xH-W)]“ 


j,^^,{p-N)\{x + Np+ri 
^ {-lp2\og{x+N)B^SPP ^ (-l)^Bfe_V | 


(25a) 


(p-1)! ‘ (p-2)! /• 

As (21) and (21a), so also (25) and (25a), can always serve together to give upper 
and lower bounds to the values sought, so long as Xq > N. 

For use in the last equations, the following formulae may be found useful: 


/ 1 1 i\ 

>-1)! ^ •' 


iP 

Sit 

b-2)! 


- „■ 2)-l 1 P-T 1\S-1 

= (-IF S - S - — 

rs=l ^ sssi ^ 


(26) 

(27) 


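(i) Expansion of the operator $H_0$

The operator $H_0$ (equation (18)), taken as a polynomial in E, has the form of

$$p_0(z) = (1 + z + \dots + z^{99})^{n_{10}} (1 + z + \dots + z^{9})^{n_{1}}
       = \frac{(1 - z^{100})^{n_{10}} (1 - z^{10})^{n_{1}}}{(1 - z)^{n_{10} + n_{1}}}
       = P_0 + P_1 z + P_2 z^2 + \dots + P_u z^u + \dots$$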
where $P_u$ =


with 


•Olu ’lu—lOr 

.„=s s (28) 

yssO s=0 

6 ! 


nb _ 

“ a\{b-a)V 


Thus, with these values of the $P_u$’s, we have

$$H_0 = P_0 + P_1 E + P_2 E^2 + \dots \qquad (29)$$
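The coefficients of the expanded operator can be obtained by straightforward polynomial multiplication. The sketch below is an illustration only: the function names are hypothetical, and the closed double sum in `p_closed` is the standard inclusion-exclusion expansion of the generating function above, offered as an assumption about the form of (28) rather than a transcription of it.

```python
from math import comb


def multiply_by_run(coeffs, length):
    """Multiply a polynomial by 1 + z + ... + z^(length - 1)."""
    out = [0] * (len(coeffs) + length - 1)
    for i, c in enumerate(coeffs):
        for j in range(length):
            out[i + j] += c
    return out


def expand_h0(n10, n1):
    """Coefficients P_0, P_1, ... of (1+...+z^99)^n10 (1+...+z^9)^n1."""
    coeffs = [1]
    for _ in range(n10):
        coeffs = multiply_by_run(coeffs, 100)
    for _ in range(n1):
        coeffs = multiply_by_run(coeffs, 10)
    return coeffs


def p_closed(u, n10, n1):
    """P_u from (1 - z^100)^n10 (1 - z^10)^n1 / (1 - z)^(n10 + n1)."""
    m = n10 + n1
    if m == 0:
        return 1 if u == 0 else 0
    total = 0
    for r in range(n10 + 1):
        for s in range(n1 + 1):
            rest = u - 100 * r - 10 * s
            if rest < 0:
                continue  # this term contributes nothing to z^u
            total += ((-1) ** (r + s) * comb(n10, r) * comb(n1, s)
                      * comb(rest + m - 1, m - 1))
    return total


if __name__ == "__main__":
    P = expand_h0(2, 1)
    print(len(P), sum(P), min(P), P[:5])
```

All coefficients produced by `expand_h0` are positive integers, which is the property the text relies on to avoid cancellation; the alternating double sum reproduces them only because integer arithmetic is exact.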


(ii) Computation of $P_u$ for large u

When u is fairly large, that is, considerably different from both 0 and
$99n_{10} + 9n_1 + 1$, the formula (28) becomes unwieldy and inaccurate. By means
of a Laplace transformation, that is, by making use of the so-called “characteristic
function” for $P_u$ (see Uspensky, 1937, chapter XII), we are able to obtain the
following Fourier series representing the continuous function which is obtained
by plotting the points $(u, P_u)$ and then connecting them with straight lines:


^ 99nio+9?ii- 


® 27T/i . 

-u= I. -A;,gos-£ {u-b), 


fjt^Q 


(30) 


where 


and 


$K = 99n_{10} + 9n_1 + 2$;  $b = \tfrac{1}{2}(99n_{10} + 9n_1)$;


^ 102«lo+™i 

“ 99%o+9% + 2’ 


A --A_ 


/ . 1007r«\»io/ 

p-z-) ( 


. 107r'M\ 


Til 




for p=^0. 

If p/K is an integer, ^ 0, then 


(™f) 


Otherwise obviously 






sm 


TT/t 

Z 


SttV 
1^ l<-^ 

I ' SttV 


lQ2?i,io+ni / TT \2 . ][ 

•= ' tov (™m) it 

10"iZ / 


/ . n N-ttio+a 1 ^1 I 1 

/ . TT \-»io-«i+2 1 

^sm-j ,^\plK-n\^-. 


The first limit is the greatest. It follows that 

.S, . 27r/4,,. , 0-000247 X 108»io+»tiZ ” I 

S ^ cos~^(g^6)< S \A^\< r-j S -i 

ft=p+i fi=p+i p-p+iP' 

0-000247 X 102»io-^»iZ f” dx 0-000247 x lO^ttio+^Z 


87r2 


■I. 


Sn^p 


(31) 


which can serve as an error function for (30). 
VI. Remarks

Sufficient mathematical tools are presented in equations (19), (21), (21a), 
(25), (25a), (28), and (30), for actually computing the geometric mean estimates 
p of densities of bacteria from the results of successive dilution experiments, 
together with the standard deviations of their natural logarithms, from formulae 
(13) and (16). The standard deviation of the logarithm of course will be a measure 
of the proportionate accuracy of the estimate, in the same sense that the ordinary 
standard deviation is a measure of the arithmetical accuracy. (It may be remarked, 
incidentally, that all logarithms occurring in the above formulae are understood to 
be natural logarithms.) 

If we assume in general that the function P(p) defined in (7) can be very nearly
approximated by a normal distribution function with respect to log p (which
should be satisfactory for bacteriological purposes), then corresponding to
$\sigma_{\log p}$ there will be a “probable error” $0.675\,\sigma_{\log p} = s$. To this will correspond a
“probable error ratio”

$$\epsilon = e^{s} - 1 = 10^{0.293\sigma} - 1,$$

that is $\log_{10}(1 + \epsilon) = 0.293\,\sigma_{\log p}$.

The meaning of $\epsilon$ is that the probability is approximately 1/2 that the true density
lies between the estimate divided by $(1 + \epsilon)$ and the estimate multiplied by $(1 + \epsilon)$.
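The conversion from a standard deviation of the natural logarithm to a probable error ratio can be sketched as follows. The function names and the sample inputs are hypothetical; the factor 0.6745 is the usual normal-theory value that the text rounds to 0.675.

```python
import math

PROBABLE_ERROR_FACTOR = 0.6745  # rounded to 0.675 in the text


def probable_error_ratio(sigma_log_p):
    """Return e^s - 1, where s = 0.6745 * sigma of the natural log."""
    s = PROBABLE_ERROR_FACTOR * sigma_log_p
    return math.expm1(s)


def probable_error_interval(p_estimate, sigma_log_p):
    """Limits within which the density falls with probability about 1/2,
    assuming log p is nearly normally distributed."""
    ratio = probable_error_ratio(sigma_log_p)
    return p_estimate / (1.0 + ratio), p_estimate * (1.0 + ratio)


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to show the arithmetic.
    print(probable_error_ratio(0.5))
    print(probable_error_interval(1.43, 0.5))
```

The interval is symmetrical on the logarithmic scale, so the estimate is the geometric mean of its two limits.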

To save labour, it remains yet to determine limits to the extents of errors 
committed by adding only, say, every twenty-fifth term corresponding to the 
operator = 

(») 

in making the computations; or else to determine short formulae for the required 
sums. For the purposes of rigour, upper bounds will also have to be determined
for the Bernoulli numbers used in the formulae. 

Mrs Naomi Lancaster has made rough computations of several values of p 
as shown below: 


       Argument           p computed     p from Halvorson     % deviation from
  n_10   n_1   n_0.1        by us           & Ziegler        Halvorson & Ziegler

   10     7      3          1.43          [illegible]             -7.0%
    8     5      1       [illegible]      [illegible]             +9.0%
    4     2      1          0.086            0.080                +7.6%


If we were to plot these percentage deviations against $n_{10}$, we should be led to
expect a maximum positive deviation of perhaps 12 % or more in the region of
$n_{10}$ = 6 or 7.
Dr ZoBell has compared plate counts with corresponding estimates from
Halvorson & Ziegler’s tables, and their ratios show exactly similar trends to 
those indicated by the above three comparisons, but relatively even more pro- 
nounced. These will probably be reported elsewhere in detail. 

In order to make possible the practical use of these results by bacteriologists 
and others, as well as to test their validity experimentally, it will be necessary to 
prepare tables of p and ε as formulated in the preceding pages, corresponding to
the tables previously prepared by Halvorson & Ziegler. This will require financial
aid from some source. 

The writer wishes to express his appreciation to Dr C. E. ZoBell for acquainting 
him with this very interesting problem, and to Dr George E. McEwen for his
encouragement and occasional aid in carrying out the work and making available
the necessary literature. Dr Harry Bateman of the Mathematics Department,
California Institute of Technology, and Dr Risselman of the University of 
California at Los Angeles, were also helpful in bringing the analysis through one 
or two difficult points. 


APPENDIX 

Table of Bernoulli numbers


ν =

1 

2 

3 

4 

6 

6 

7 

N= 1 

1 

1 

1 

1 

1 

1 

1 

2 

3 

4 

6 

6 

7 

8 


1 

7 

3 

31 


127 

86 

2 

6 

2 

15 

3 

’I? 

12 


3 

6 

9 

43 

69 

3026 

311 

3 

2 

2 

2 

5 

4 

84 

4 



13 

10 

243 

185 

■ 6821 

1326 

4 

2 

3 

10 

3 

42 

3 


5 

20 

76 

331 

675 

11216 

6226 

6 

,2 

3 

4 

6 

4 

21 

3 

6 


19 

63 

1087 

777 

30083 

6432 


2 

2 

10 

2 

21 


7 

77 

49 

1939 

4763 

9992 

43120 

7 

2 

6 

10 

6 

3 

3 

8 

4 

60 

3 

72 

4819 

16 

1476 

146240 

21 

33664 

g 

9 

21 

405 

2614 

2666 

93886 

673976 





8 

2 

4 

6 

7 

10 


155 

276 

762 

12650 

677465 

667326 

6 

6 

2 

3 

28 

4 

11 

11 

187 

363 

16258 

26499 

1168509 

3164029 

2 

6 

2 

15 

4 

28 

12 
Table of Bernoulli numbers (cont.)



1 

2 

3 

4 

6 

6 

7 

N=12 

6 

37 

234 

16149 

10 

10023 

2842373 

42 

466726 

13 

13 

130 

1183 

20631 

176267 

2238730 

9453363 

2 

3 

4 

10 

12 

21 

12 

14 


.301 

736 

82439 

41895 

976623 

7704026 

7 

6 

2 


2 

6 

6 

15 

16 

2 

115 

2 

450 

7181 

2 

29176 

10129615 

42 

2026650 

16 

8 

190 

544 



7330868 

9329056 

3 

6 

3 

21 

3 

17 

17 

221 

2601 

87601 

106641 

10384229 

37236205 


3 

4 

16 

2 

21 

8 

18 


165 

1639 

.36483 

70281 

19239635 

27257671 

■1 

mgm 

6 

28 

4 

19 

19 



46040 



39123375 

... 

6 

2 

6 

12 

28 

4 

20 

10 


1050 

66049 

117076 

17672416 

41371226 



3 

6 


14 

3 

21 

21 

112 

4851 

133217 

7714707 

6022439 

38264149 . 

2 

4 

10 

62 

3 

2 

22 

11 

737 

2783 

169819 

1116983 


78466686 

6 

2 

10 

6 

21 

3 

23 

23 

806' 

1687 

570663 

461817 

69687136 



2 

6 


2 

21 

3 

24 

12 

146 


112379 

6 


76306988 

21 


26 

25 

475 

8125 

26380 


96763700 

1486136626 


2 

3 

4 

3 

21 

24 

26 

13 

1027 

6 

4563 

2 

461578 

16 

419796 

486388487 

84 

322887513 

4 

27 

27 

369 

6103 

178462 

2019087 

202034677 

416849896 

2 

m 

mm 


4 

28 

4 
NOTE ON THE INVERSE AND DIRECT METHODS OF
ESTIMATION IN R. D. GORDON’S PROBLEM 

By E. S. PEARSON 

The practical solution of the problem of estimating the mean density of a bacterial 
population by the dilution method requires not only the determination of 
a single-valued estimate, but also some measure of the reliability of this estimate.
By the introduction of an ingenious mathematical procedure Mr Gordon has 
taken, in the preceding paper, a first step in the solution of this problem on 
lines involving the application of Bayes’s theorem.* Thus if p is the mean 
density per unit volume at a given dilution, he obtains an a posteriori distribution
for p and, since it seems likely that the derived distribution of logp, rather than 
that of p, will be approximately normal, he shows how the expectation and 
standard deviation of log p may be calculated. With the help of these a probability 
statement regarding the unknown p of the form given on p. 178 may be made. 
To put the results into working form extensive computation will, however, be 
necessary. 

Contrasted with this inverse approach is what may be termed the direct 
solution involving the determination of a fiducial or confidence interval. This 
solution requires (a) the choice of an appropriate sample estimate of p, say R, and
(b) the determination of its sampling distribution, say p(R | p). Mr Gordon refers
to R. A. Fisher’s maximum likelihood estimate, say $R_L$ (Fisher, 1922, pp. 363-6),
whose calculation in the case where ten tubes are examined at each of three
dilution levels, the dilution factor being 10 : 1, has been made easy by the tables
of Halvorson & Ziegler (1933a), but convinced as he clearly is that the inverse
approach is the only legitimate one, he no doubt did not feel it necessary to
discuss the possibility of a fiducial solution as alternative to his own. Since,
however, a paper by Matuszewski et al. (1935) did present a preliminary working
solution of this kind for just the same experimental arrangement (I will call it
the 10, 3, 10 arrangement) discussed by Halvorson & Ziegler and by Gordon,
it may be useful to make some reference to this result, and also to add a few 
comments on the difference between the two lines of approach. 

In their first paper (1933a, p. 121) Halvorson & Ziegler gave a table showing the
value of $R_L$ corresponding to each (or rather, most) of the possible combinations
of $n_{10}$, $n_1$ and $n_{0.1}$, the numbers of tubes (in Gordon’s notation) out of the ten
tested at each dilution which show growth. In their third paper (1933c) they
carried out some investigations into the sampling variation of $R_L$ for a fixed p.

* It is interesting to note that the pioneer paper on this subject by Greenwood & Yule (1917) 
also followed the Bayes’s theorem approach. 
Taking the 10, 3, 10 arrangement and the four cases p = 0.15, 0.25, 0.50 and
1.50, they calculated from the terms of an appropriate multinomial expansion
the probability of observing different combinations of $n_{10}$, $n_1$ and $n_{0.1}$, and hence,
using their tables, they obtained the probability associated with different values
of $R_L$. Their results, presented in the form of tables and a diagram, show that:

(1) The distributions of $R_L$ are asymmetrical.

(2) The distributions of $\log R_L$ are more nearly symmetrical, but since $R_L$
can only assume a finite number of discrete values, the distributions cannot in
either form be represented adequately by smooth curves.

(3) Calculations from the partially grouped values give the results in Table I.


TABLE I

p                                   0.150      0.250      0.500      1.500
Mean (R_L)                          0.164      0.284      0.558      1.648
σ(R_L)                              0.066      0.117      0.226      0.689
log p                              -0.824     -0.602     -0.301     +0.176
Mean (log R_L)                     -0.816     -0.578     -0.285     +0.184
σ(log R_L)                          0.183      0.164      0.163      0.168
σ(log R_L) (limiting formula)       0.1535     0.1532     0.1768     0.1550

N.B. Logarithms are to base 10.


(4) The values of $\sigma(\log R_L)$ calculated from Halvorson & Ziegler’s probability
distribution remain nearly constant in the range of p considered. Having regard
to the respective standard deviations, the bias of $\log R_L$ is of less importance
than the bias of $R_L$. However, if a method of obtaining accurate fiducial limits
were available, the bias in the single-valued estimate would be of no importance.

(5) The figures in the last row have been calculated from Fisher’s formula
for the large sample value of the variance of a maximum likelihood estimate,
which reduces in this case to (Fisher, 1922, p. 364):

$$\{\sigma^2(\log_e R_L)\}^{-1} = 10\left\{\frac{e^{-10p}}{1-e^{-10p}}\,100p^2 + \frac{e^{-p}}{1-e^{-p}}\,p^2 + \frac{e^{-p/10}}{1-e^{-p/10}}\,\frac{p^2}{100}\right\}$$
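The large-sample formula above is easy to evaluate numerically. The sketch below is an illustration with hypothetical function and parameter names; it assumes ten tubes at each of the three volumes 10, 1 and 0.1, takes the formula as giving the variance of the natural logarithm, and converts the result to base 10 for comparison with the tabled figures.

```python
import math


def sigma_log10_ml(p, tubes=10, volumes=(10.0, 1.0, 0.1)):
    """Large-sample standard error of log10 of the maximum likelihood
    estimate, for `tubes` tubes at each of the given relative volumes."""
    information = tubes * sum(
        (v * p) ** 2 * math.exp(-v * p) / (1.0 - math.exp(-v * p))
        for v in volumes
    )
    sigma_natural_log = 1.0 / math.sqrt(information)
    return sigma_natural_log * math.log10(math.e)  # change of base


if __name__ == "__main__":
    for p in (0.15, 0.25, 0.50, 1.50):
        print(p, round(sigma_log10_ml(p), 4))
```

Evaluated at p = 0.15, 0.25, 0.50 and 1.50 this gives values close to 0.1535, 0.1532, 0.1768 and 0.1550, in line with the last row of Table I.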
This formula gives minima for $\sigma(\log R_L)$ in the neighbourhood of p = 0.16, 1.6
and 16. A series of calculated values including those tabled above are:


TABLE II 


p 


0-16 

0-26 





2-00 






0-163 




0*166 

■ 

0-156 

0-163 



0-174 


The probability values tabled by Halvorson & Ziegler are not sufficiently detailed
to determine how far the differences between the values of $\sigma(\log R_L)$ compared
in the last two rows of Table I are due to inadequacy in the large-sample variance
formula. It is, however, clear that the standard error of the logarithm of the
maximum likelihood estimate changes very little with p. Halvorson & Ziegler
reached a similar result by noting that $\sigma(R_L)/p$ was very stable. The same result
was noticeable in a series of calculations concerned with estimating the density
of organisms in milk, made by Barkworth & Irwin (1938). These authors analysed
the results of seven separate experiments in which 266 tubes were tested at each
of four dilutions, namely 1 : 10, 1 : 50, 1 : 250, and 1 : 1250. They obtained the
standard error of the maximum likelihood estimate of p from a large sample
formula analogous to that given above; for seven values of p lying between
19 and 67, the ratio $\sigma(R_L)/p$ lay between 0.068 and 0.062.

It is clear that if more detailed computations of the probability distribution
of $R_L$ for a wider range of values of p were made on these lines, charts or tables
giving fiducial or confidence limits for p could be readily supplied. In their paper
of 1935 referred to above Matuszewski et al. have provided one such chart formed,
as will be described below, on a basis which is partly empirical. The chart may be
used as follows:

(a) Having observed experimentally the three numbers $n_{10}$, $n_1$ and $n_{0.1}$,
obtain from Halvorson & Ziegler’s tables the maximum likelihood estimate,
$R_L$, of p.

(b) Taking $R_L$, read off from the chart lower and upper confidence limits,
say $p_1(R_L)$ and $p_2(R_L)$.*

(c) Then, using Neyman’s terminology, the statement

$$p_1(R_L) < p < p_2(R_L)$$

may be associated with a confidence coefficient of 0.95. In other words, if this
procedure is applied in general bacteriological practice to the three frequencies
$n_{10}$, $n_1$ and $n_{0.1}$ obtained from the dilution method, then the odds will be at

* The authors take λ' as the maximum likelihood estimate of mean density λ in the most
concentrated of the three solutions, so that λ' = 10$R_L$.
least* 19 : 1 that the interval $p_1(R_L)$ to $p_2(R_L)$ will cover the unknown mean
density p.

It should be emphasized that the chart provided by Matuszewski et al. was
not based on an exact mathematical solution, but found by graduating a series 
of experimental sampling results; the following quotation from their paper 
(p. 76) explains the method employed: 

The method followed by Miss J. Supińska consisted in a complex sampling experiment,
using Tippett’s random sampling numbers. The experiment produced a series of values
of the variates $x_{10}$, $x_1$ and $x_{0.1}$ [i.e. $n_{10}$, $n_1$ and $n_{0.1}$ in Gordon’s notation] following the
sampling distribution which they would follow in our hypothetical conditions of the
experiment. For each series of $x_{10}$, $x_1$ and $x_{0.1}$ it was possible to read up from the table of
Halvorson & Ziegler an estimate, say λ', of the concentration λ. The estimates λ' have
been then tabulated and an empirical frequency distribution of λ' corresponding to several
fixed values of λ has been determined. Following the method described by J. Neyman,
these empirical frequency distributions were then used to construct confidence intervals
as if they were the accurate ones. As the random variation could not fail to affect the limits
of the intervals it was felt necessary to correct them by fitting two parabolae, one marking
the lower and the other the upper limits of the confidence intervals.

While therefore the writers did not claim exactness for their results, their chart, 
combined with Halvorson & Ziegler’s table, does provide a provisional working 
solution not yet available to those who prefer the inverse approach. 

It is interesting to obtain from the chart the limits $p_1(R_L)$ and $p_2(R_L)$ for the
three hypothetical cases discussed by Gordon on p. 178 above. In Table III these
limits and also their logarithms are given.


TABLE III

   Observed frequencies       Gordon’s       Maximum likelihood
  n_10    n_1    n_0.1        estimate         estimate, R_L

   10      7       3            1.43              1.53
    8      5       1         [illegible]          0.267
    4      2       1         [illegible]          0.080

[The columns giving the 95 % confidence limits $p_1$ and $p_2$ and the logarithms are illegible,
apart from the entries 0.477 in the first row and -1.64 in the third.]


It will be noted that:

(1) The 95 % confidence interval for p is in all cases relatively broad and,
having regard to this, the differences between the single valued estimates $\bar{p}$ and
$R_L$ are of little importance.

(2) While neither $\bar{p}$ nor $R_L$ is central with regard to the interval, $\log R_L$

* 0.95 is a lower limit to the probability of a correct statement, owing to the discontinuous
distribution of the frequencies.
differs only slightly from $\tfrac{1}{2}(\log p_1 + \log p_2)$, as will be seen by comparing the 6th
and last columns. The length of the interval $\log p_2 - \log p_1$ changes slowly,
increasing as $R_L$ decreases. A study of the chart shows that for $R_L$ roughly in the
range 0.6-2.4, the breadth of the interval $\log p_2 - \log p_1$ remains nearly constant
at a value of about 0.55, increasing slowly as $R_L$ drops below 0.6 or rises above
2.4. Without further details of Supińska’s sampling experiment, it is not possible
to make a closer comparison between this confidence chart based on random
sampling and the four distributions of $R_L$ tabled by Halvorson & Ziegler. Both
results suggest, however, that for a considerable range of values of p, $\log_{10} R_L$
is distributed about a mean value of approximately $\log_{10} p$ with a standard error
of about 0.16.

If finally it be asked whether the direct or inverse solution is to be preferred,
the answer must, I think, be that this can only be a matter of personal opinion. 
As mentioned in the footnote on p. 181 Greenwood & Yule (1917) preferred the 
latter method. The fundamental difference of the two methods of approach has 
recently been emphasized by Harold Jeffreys, with whom it is to be supposed 
that Gordon is in substantial agreement. Writing with regard to the a priori 
distribution of an unknown parameter, such as p, Jeffreys says (1938, p. 466): 

I can find nothing in the works of the pioneers of the principle of inverse probability 
to suggest that they identified the prior probability with a known frequency, and believe 
that if such an idea had occurred to them they would have repudiated it as definitely as 
I do. The function of a prior probability used to express ignorance is simply to express 
formally the transition from an inference about different possible data, given the
hypothesis, to one about different hypotheses given the same data, and this transition must
be made somehow on any theory.

It follows that if the prior probability distribution cannot be identified with
frequency, neither can the posterior distribution. A probability, in Jeffreys’ sense,
obtained from the integral of the posterior distribution between limits $p_1$ and $p_2$,
can be regarded as no more than a rational measure of the degree of belief that the
experimenter may place in the truth of the statement $p_1 < p < p_2$. It can have no
precise link with long-run relative frequency. This, indeed, it could only have
if there were reason to suppose that in repeated dilution experiments the population
value of p would be distributed uniformly between 0 and ∞.* But the
legitimacy of any attempt to make this connexion Jeffreys has denied emphatically.
For him, if I understand rightly, the posterior distribution derived from
the formal prior distribution provides the one type of numerical scaling which
he regards as useful in forming an opinion on the value of an unknown constant.
To quarrel with this conviction would be out of place.

On the other hand, those who agree with Jeffreys must recognize that the 
direct approach with its fiducial argument has resulted from the development 

* Gordon has assumed this to be the most appropriate form of prior distribution for p; possibly
Jeffreys would prefer to make the distribution vary as 1/p.
of a line of thought which finds probability statements useful in a form in which
they can be directly related to long-run frequency of occurrence. To know that if
a certain experimental technique is carried out and an arithmetical calculation
made, then there are strong grounds for believing* that about 19 times out of 20
the limits $p_1(R_L)$, $p_2(R_L)$ will include the unknown p is a form of information
which appeals to a large number of statisticians. In this and other problems it
seems likely to appeal particularly to persons who are frequently repeating the
same form of operation, and can therefore the more readily appreciate the
consequences of “being wrong” once in 10, once in 20, or once in 100 times, according
to the risk they choose to allow. It is true that in certain instances there may be
other special information which will enable the experimenter to guess at narrower
or modified limits, rather than those obtained from the standard fiducial procedure.
But whether this information is ever of a kind which can be put into numerical form
is doubtful. Certainly this would not be achieved by using the prior probability
distribution proportional to dp or dp/p. The fact that we can sometimes narrow
the range of uncertainty and so get nearer the mark, does not detract from
the long-run “safeness” of the fiducial argument. Of course, whatever the
approach, it is essential that the sampling should have been random.

From the point of view of the bacteriologist this difference of opinion between 
experts may be discouraging, but it has been shown more than once that the two 
lines of approach lead to results which, from the practical point of view, are almost 
precisely the same. Thus Jeffreys (1937), starting with a prior probability law
for σ of dσ/σ, has reached “Student’s” distribution for the posterior probability
law for the mean; it follows that in this instance the fiducial limits of the direct
approach associated with a confidence coefficient of, say, 0.95, will correspond
exactly with those obtained from Jeffreys’ posterior distribution and associated
with this probability measure of 0.95. It is to be hoped, therefore, that if Mr
Gordon obtains financial support for the lengthy computation required to produce
tables of p and ε, he will at the same time make some research into the
correspondence of his limits for p with those following from the direct approach.
By this means he would undoubtedly widen the range of persons who could use 
his tables with confidence. 
ON THE DISTRIBUTION OF MAXIMUM LIKELIHOOD

ESTIMATES 


By B. L. WELCH 

The properties of maximum likelihood estimates have been discussed by F. Y.
Edgeworth (1908), R. A. Fisher (1922) and others. Most important is the property
that in large samples a maximum likelihood estimate tends to be normally
distributed with a variance which is given by a very simple formula and that no
other estimate can have a smaller variance. If we have a random sample of n
from a population whose probability law, p(x | θ), depends on only one parameter
θ, and if T is the maximum likelihood estimate of θ, then under certain conditions
it may be shown that T is in large samples normally distributed about θ with
variance $1/(nA_\theta)$, where

$$A_\theta = \int \left\{\frac{\partial \log p(x \mid \theta)}{\partial \theta}\right\}^2 p(x \mid \theta)\,dx. \qquad (1)$$



Similar formulae are available for the variances and covariances of estimates 
when there are several parameters, but only the single parameter case will be 
discussed here. 

When dealing with small samples the maximum likelihood estimate T is
frequently adopted together with the method of approximating its distribution
by referring it to a normal curve with mean θ and variance $1/(nA_\theta)$. The question
arises whether this is an adequate procedure. This question splits into two parts.
First, we may ask how far we are likely to go wrong by assuming that T is
actually distributed in the manner known to be correct in large samples: and
secondly, we may ask how far the advantage which T holds over any other
estimate in the matter of sensitivity in large samples is retained in small samples.
The first of these problems has an interesting historical aspect. Karl Pearson
(1936) has told us that because of his doubt of the adequacy of the approximation
in finite samples he made no subsequent use of the above large sample variance
formula originally given by him and L. N. G. Filon (1898). However, the real
reason for his view that the approximation is not good perhaps lay partly in the
fact that he had not noted that the formula referred only to maximum likelihood
estimates, a point not made clear until Edgeworth returned to the problem in
1908. Indeed this question of how good the approximation is cannot be answered
very definitely, for there is, as a rule, no general agreement as to how close an
approximation should be before it can be termed good.

The second problem has been discussed extensively by R. A. Fisher (e.g.
1925), who concludes that the maximum likelihood estimate still retains in
small samples desirable properties which recommend its use in preference to
other estimates.

The present note is concerned only with the first problem and, in connexion 
with this, it is not intended to imply that the approximation by the large sample 
formula is not good but rather to give a method which may supply a closer 
approximation if this be thought desirable. 

Let us write p(x | θ) equal to exp[φ(x, θ)] so that the maximum likelihood
estimate T is given by

$$\Sigma \phi'(x, T) = 0. \qquad (2)^*$$

In general, the solution of this equation is not algebraically simple, although
the numerical solution by iteration may not be difficult. It is therefore not
possible to find the distribution of T, and indeed it is generally impracticable
to find even the moments of T. However, usually the actual distribution of T
is not required, but simply a method of calculating the probability that T shall
exceed any specified value $T_0$ (say). It is often possible to find the moments
of another quantity which will facilitate such a calculation.

If the sample be represented by a point in n-dimensional space, then all
samples yielding the same value $T_0$ of T will lie on the hypersurface $\Sigma\phi'(x, T_0) = 0$.
Conversely, with most probability laws likely to be encountered, and certainly
for that of the example given below, it will not be possible for this hypersurface
to contain points yielding a maximum likelihood estimate different from $T_0$.
On one side of the hypersurface we may therefore expect $T > T_0$ and on the other
side $T < T_0$. Now since by Taylor’s theorem

$$\Sigma\phi'(x, T_0) = \Sigma\phi'(x, T) + (T_0 - T)\,\Sigma\phi''(x, T) + \text{etc.}, \qquad (3)$$

and since by definition of T we have $\Sigma\phi'(x, T) = 0$ and $\Sigma\phi''(x, T)$ negative, we
shall expect $T > T_0$ on that side of the hypersurface for which $\Sigma\phi'(x, T_0) > 0$.
Hence if the maximum likelihood equation establishes a many-one relationship
between sample points and values of T, we see that the probability that $T > T_0$
is equal to the probability that $\Sigma\phi'(x, T_0) > 0$. But the quantity $\Sigma\phi'(x, T_0)$ is
the simple sum of a number of independent components, and its cumulants
therefore follow simply from those of a single $\phi'(x, T_0)$. Any of the usual methods
of approximating to probabilities from cumulants may then be employed to
evaluate $P\{\Sigma\phi'(x, T_0) > 0\}$ and hence $P(T > T_0)$.

As an example consider the probability law

$$p(x \mid \theta) = \frac{1}{2\Gamma(5/4)}\, e^{-(x-\theta)^4}, \qquad (4)$$

which has been discussed at some length by Edgeworth (1908). We have
$\phi'(x, \theta) = 4(x-\theta)^3$ and therefore

$$\Sigma\phi'(x, T_0) = 4\Sigma(x - T_0)^3. \qquad (5)$$

* The dash denotes differentiation with respect to θ, and Σ denotes summation over the whole
sample.
Now by a straightforward method it may be shown that the first three cumulants
of $4(x - T_0)^3$, when the true value of the parameter is θ, are given by

$$\kappa_1 = \mu_1' = 12c(\theta - T_0) + 4(\theta - T_0)^3,$$
$$\kappa_2 = \mu_2 = 12c + (60 - 144c^2)(\theta - T_0)^2 + 144c(\theta - T_0)^4,$$
$$\kappa_3 = \mu_3 = 36(\theta - T_0)\left[(5 - 12c^2) + (48c + 96c^3)(\theta - T_0)^2 + (36 - 144c^2)(\theta - T_0)^4\right], \qquad (6)$$

where

$$c = \frac{\Gamma(3/4)}{\Gamma(1/4)} = 0.33799.$$

The cumulants of $4\Sigma(x - T_0)^3/n$ are then $\kappa_1$, $\kappa_2/n$ and $\kappa_3/n^2$. By fitting a Pearson
Type III curve with these three moments we can then approximate to
$P\{4\Sigma(x - T_0)^3/n > 0\}$. If closer approximations are necessary further moments
may be calculated and some other kind of curve fitted.

More frequently, perhaps, the converse problem is posed of finding $T_0$ such
that $P(T > T_0)$ has a specified value. The above method then involves
a certain amount of trial and error. For instance, suppose $n = 25$ and $T_0$ is
wanted so that $P(T > T_0)$ equals 0.05. A first approximation is obtained by
taking $T$ to be normally distributed about $\theta$ with variance $1/25A^2$, where

$$A^2 = \int \{4(x - \theta)^3\}^2\, p(x \mid \theta)\,dx = 12c = 4.056. \tag{7}$$

The approximate standard deviation of $T$ is then 0.0993 and $T_0$ is $\theta + 0.163$.
Substituting this value of $T_0$ in (6) we obtain $\kappa_1 = -0.678$, $\kappa_2/25 = 0.210$, and
$\kappa_3/625 = -0.0392$. The standard deviation of $4\Sigma(x - \theta - 0.163)^3/25$ is therefore
0.458, and the skewness $\sqrt{\beta_1} = -0.407$. Using these moments and the
very convenient tables of L. R. Salvosa (1930) we find that approximation by
a Pearson Type III curve gives $P\{4\Sigma(x - \theta - 0.163)^3/25 > 0\} = 0.056$. From this
it is not difficult to see that, by adjusting $T_0$ to the value $\theta + 0.168$, the Type III
approximation to $\Sigma\phi'(x, T_0)/n$ will give $P\{4\Sigma(x - \theta - 0.168)^3/25 > 0\} = 0.05$, i.e.
$P(T > \theta + 0.168) = 0.05$.
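
The arithmetic of this example can be retraced with a short script. The sketch below is only an illustration: it assumes the law $p(x \mid \theta) \propto \exp\{-(x-\theta)^4\}$, for which $E(x-\theta)^2 = \Gamma(3/4)/\Gamma(1/4) = c$, and it writes out the three cumulants of $4(x - T_0)^3$ as polynomials in $d = \theta - T_0$. The function names `cumulants` and `mean_score_summary` are hypothetical and are not taken from the paper.

```python
import math

# c = E[(x - theta)^2] under p(x | theta) proportional to exp{-(x - theta)^4}
C = math.gamma(0.75) / math.gamma(0.25)


def cumulants(d, c=C):
    """First three cumulants of 4(x - T0)^3, where d = theta - T0."""
    k1 = 12 * c * d + 4 * d**3
    k2 = 12 * c + (60 - 144 * c**2) * d**2 + 144 * c * d**4
    k3 = 36 * d * ((5 - 12 * c**2) + (48 * c + 96 * c**3) * d**2
                   + (36 - 144 * c**2) * d**4)
    return k1, k2, k3


def mean_score_summary(d, n):
    """Mean, standard deviation and skewness of 4 * sum((x - T0)^3) / n."""
    k1, k2, k3 = cumulants(d)
    var = k2 / n                      # second cumulant of the mean
    skew = (k3 / n**2) / var**1.5     # third cumulant over sd cubed
    return k1, math.sqrt(var), skew


if __name__ == "__main__":
    n = 25
    sd_T = 1 / math.sqrt(n * 12 * C)  # large-sample s.d. of T, using A^2 = 12c
    print("s.d. of T:", round(sd_T, 4))
    print("mean, s.d., skewness:", [round(v, 3) for v in mean_score_summary(-0.163, n)])
```

With $d = -0.163$ and $n = 25$ the three summary values agree, to the three decimals quoted, with the figures $-0.678$, $0.458$ and $-0.407$ used in the text; the final Type III tail area still has to be read from tables or computed separately.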

There is in the present example little difference between the results given by
the large sample approximation to the distribution of $T$ and the Type III
approximation to the distribution of $\Sigma\phi'(x, T_0)/n$. This is partly due to the
nature of $p(x \mid \theta)$, owing to whose symmetry the maximum likelihood estimate, $T$,
has a distribution which is exactly symmetrical with true mean at $\theta$. The main
approximation in using the large sample method therefore lies in the value
adopted for the variance. Often the distribution of $T$ will not be symmetrical
nor have true mean exactly at $\theta$. It is in such cases that a method which uses
exact values of the lower order moments of $\Sigma\phi'(x, T_0)/n$ may be expected to
yield the greatest improvement in accuracy.


ON NEYMAN’S “SMOOTH” TEST FOR GOODNESS OF FIT


I. DISTRIBUTION OF THE CRITERION $\psi^2$ WHEN
THE HYPOTHESIS TESTED IS TRUE

By F. N. DAVID

1. Introductory

The $\chi^2$ test has passed into common use since its introduction by Karl Pearson in
1900. It has, however, long been recognized that in certain cases it is inadequate
as a test for goodness of fit, and in particular in the case when the deviations of
the observations from the hypothesis tested are consecutively positive (or negative);
for the $\chi^2$ test, by taking into account the square of the difference between
the observed and expected values, renders it impossible to pay attention to the
sign of this difference. It is because of this inadequacy that Neyman (1937)
introduced the “smooth” test for goodness of fit. This “smooth” test has been
discussed by E. S. Pearson (1938) in some detail, which absolves us from further
discussion here, but the procedure of the test will be set out in order that the
substance of this present paper may be understood.

Following the procedure of previous papers (1933, 1936) Neyman (1937, p. 156)
insists that a choice of a suitable test for goodness of fit may only be made after a
statement of the probability functions specified both by the hypothesis tested and
by the admissible hypotheses alternative to that tested. In order to supplement
the $\chi^2$ test he confines the alternative frequency laws to a class of functions which
he terms “smooth”, and whose form it is supposed can be represented by transformed
Legendre polynomials. The keystone of his test lies in a probability
integral transformation previously given by R. A. Fisher (1932) and K. Pearson
(1933), by means of which the distribution of $n$ independent random variables
may be made rectangular.*

The criterion $\psi^2$, which may be used instead of $\chi^2$ if the admissible probability
laws are smooth, is designed to test the departure of the distribution of the transformed
variables from rectangularity. Before it may be applied, however, it is
necessary to choose what may be termed the order of the test, bearing in mind
which departures from the hypothesis tested it is most important for the investigator
not to overlook. The question of the appropriate order of test to choose has
been discussed to some extent by Neyman (1937) and E. S. Pearson (1938).
Further investigation into the matter has been carried out and the results will be
published in a second part of this paper. Once the order of the test has been
decided, then the calculation of the criterion is a comparatively simple matter
and may be set out in the following way.

* For example if

$$p(y) = \text{constant for } a < y < b, \qquad p(y) = 0 \text{ otherwise},$$

we should say that $y$ was distributed rectangularly in the interval $[a; b]$. An alternative method of
expression would be to say that in the interval $[a; b]$ all values of $y$ are equally likely.

Assume that there are $n$ independent random variables $x_1, x_2, \ldots, x_n$
following a known continuous probability law which is specified by $H_0$, the
hypothesis to be tested. By means of the relation

$$y = \int_{-\infty}^{x} p(x \mid H_0)\,dx, \tag{1}$$

the $n$ independent variables $x$ are transformed into $n$ independent variables $y$
which are rectangularly distributed, whatever the probability law of the $x$'s. If it
is decided that a test of the $k$th order is to be applied, the next step involves the
calculation of $k$ quantities $u_i$ ($i = 1, 2, \ldots, k$) by means of the substitution of the $n$
variables $y$ which were obtained from (1) into the appropriate Legendre polynomials,
$\pi_i$, where

$$u_i = n^{-1/2} \sum_{j=1}^{n} \pi_i(y_j), \qquad i = 1, 2, \ldots, k. \tag{2}$$

The first four Legendre polynomials* are

$$\begin{aligned}
\pi_1(y) &= \sqrt{12}\,(y - \tfrac12), & \pi_3(y) &= \sqrt{7}\,\{20(y - \tfrac12)^3 - 3(y - \tfrac12)\}, \\
\pi_2(y) &= \sqrt{5}\,\{6(y - \tfrac12)^2 - \tfrac12\}, & \pi_4(y) &= 210(y - \tfrac12)^4 - 45(y - \tfrac12)^2 + \tfrac98.
\end{aligned} \tag{3}$$

The criterion consists of the sum of the squares of these $u$'s, thus

$$\psi_k^2 = \sum_{i=1}^{k} u_i^2 = \frac{1}{n} \sum_{i=1}^{k} \Big( \sum_{j=1}^{n} \pi_i(y_j) \Big)^2. \tag{4}$$

Any such criterion is, however, of little use if its distribution is not known. An
approximation to the distribution was given by Neyman and it is this approximation
which is reviewed in the next section.
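
The steps just described (probability integral transformation, substitution into the polynomials, sum of squares) can be sketched as follows. This is a minimal illustration rather than a reference implementation: the polynomials are the transformed Legendre polynomials orthonormal on $[0, 1]$, the hypothesis tested is taken, purely for the example, to be a normal law with known mean and standard deviation, and the names `smooth_test` and `normal_cdf` are hypothetical.

```python
import math

# Transformed Legendre polynomials, orthonormal on the interval [0, 1].
POLYNOMIALS = (
    lambda y: math.sqrt(12) * (y - 0.5),
    lambda y: math.sqrt(5) * (6 * (y - 0.5) ** 2 - 0.5),
    lambda y: math.sqrt(7) * (20 * (y - 0.5) ** 3 - 3 * (y - 0.5)),
    lambda y: 210 * (y - 0.5) ** 4 - 45 * (y - 0.5) ** 2 + 9 / 8,
)


def normal_cdf(x, mean=0.0, sd=1.0):
    """Probability integral of a normal law (the assumed hypothesis H0)."""
    return 0.5 * (1 + math.erf((x - mean) / (sd * math.sqrt(2))))


def smooth_test(xs, cdf, order):
    """Return (psi_k^2, [u_1, ..., u_k]) for a test of the given order (1 to 4)."""
    ys = [cdf(x) for x in xs]            # step 1: transform x to rectangular y
    n = len(ys)
    us = [sum(pi(y) for y in ys) / math.sqrt(n)   # step 2: u_i = n^(-1/2) sum pi_i(y_j)
          for pi in POLYNOMIALS[:order]]
    return sum(u * u for u in us), us    # step 3: sum of squares


if __name__ == "__main__":
    sample = [-1.3, -0.4, 0.1, 0.2, 0.9, 1.7]
    psi2, us = smooth_test(sample, normal_cdf, order=2)
    print(psi2, us)
```

For a large sample the value returned would be referred to the $\chi^2$ distribution with `order` degrees of freedom, which is the approximation examined in the next section.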

2. Distribution of $\psi^2$

It was shown by Neyman that when $n$, the number of observations in the
sample, is large, then each of the quantities $u_i^2$ ($i = 1, 2, \ldots, k$) is distributed
independently as $\chi^2$ with one degree of freedom. It will follow therefore that for
large $n$ the criterion $\psi_k^2$ of a $k$th order test will be distributed as $\chi^2$ with $k$ degrees
of freedom. As with many statistical tests based on the assumption that $n$ is
large, the decision as to the numerical value of $n$ is left to the user of the test and,
with the introduction of the personal element, we get a variation in ideas both as to
what is meant by a large sample and the point at which a large sample becomes a
small sample. It is therefore useful to obtain some idea of the degree of approximation
involved in the supposition that $\psi^2$ is distributed as $\chi^2$. Although this may
only be investigated completely by the discovery of the true distribution of $\psi^2$,
it is not difficult to find Pearson curves which have moments agreeing with the true
moments of $\psi^2$ and these may be used to throw light on the approximation to the
$\chi^2$ distribution.

* It is of interest to note that a recurrence formula for these polynomials follows directly from
a general formula for Legendre polynomials (see for example Whittaker & Watson, Modern Analysis,
p. 302, n). If $z = y - \tfrac12$ then

$$\pi_{n+1}(z) - 2z\,\frac{\sqrt{(2n+1)(2n+3)}}{n+1}\,\pi_n(z) + \frac{n}{n+1}\sqrt{\frac{2n+3}{2n-1}}\,\pi_{n-1}(z) = 0.$$

The derivation of the true moments of $\psi^2$ is a matter of simple but tedious
algebra. Using the notation $\mathcal{E}$ to denote mathematical expectation, it is seen that
we must calculate

$$\mu_i = \mathcal{E}\{\psi^2 - \mathcal{E}(\psi^2)\}^i \quad \text{for } i = 1, 2, 3, 4, \tag{5}$$

where $\psi^2$ is as defined in (4) and the $y$'s are independently distributed within the
interval $[0; 1]$ in accordance with the rectangular law. I have calculated only the
moments of $\psi^2$ for first and second order tests, partly because numerical work*
leads me to believe that these order tests are more powerful to detect any departures
from the basic hypothesis than any other higher order test, and partly
because the necessary algebra became so very heavy.

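From the properties of the transformed Legendre polynomials we know (see
Neyman, 1937) that

$$\mathcal{E}(\pi_i(y_j)) = \mathcal{E}(\pi_i(y_j)\,\pi_m(y_j)) = 0 \quad \text{for any } i, j \text{ and } m \neq i. \tag{6}$$

Also since the observations are independent of one another

$$\mathcal{E}(\pi_i(y_j)\,\pi_m(y_l)) = 0 \quad (j \neq l), \tag{7}$$

and

$$\mathcal{E}(\pi_i^2(y_j)) = 1. \tag{8}$$

In general it will be noticed that we require to calculate

$$\mathcal{E}\Big(\sum_{j=1}^{n} \pi_i(y_j)\Big)^{2l} \quad \text{for } i = 1, 2 \text{ and } l = 1, 2, 3, 4 \tag{9}$$

and

$$\mathcal{E}\Big\{\Big(\sum_{j=1}^{n} \pi_i(y_j)\Big)^{2p} \Big(\sum_{j=1}^{n} \pi_m(y_j)\Big)^{2q}\Big\} \quad \text{for } i, m = 1, 2 \text{ and } p, q = 1, 2, 3. \tag{10}$$

The calculations required are straightforward and follow at once from the application
of the elementary theorems concerning the sum and product of expectations.
The final results are given below in Table I.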
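
For the first order test the algebra can be checked mechanically. The sketch below is an illustration under stated assumptions, not the author's working: it takes $u_1 = n^{-1/2}\sum \pi_1(y_j)$ with $\pi_1(y) = \sqrt{12}(y - \tfrac12)$, uses the even moments of a variable uniform on $(-\tfrac12, \tfrac12)$, and expands the even powers of a sum of independent, symmetric terms with the usual multinomial counts. The function names are hypothetical.

```python
from fractions import Fraction


def uniform_even_moment(r):
    """E[z^r] for z uniform on (-1/2, 1/2), with r even."""
    return Fraction(1, (r + 1) * 2**r)


def raw_moments_u(n):
    """E[u^2], E[u^4], E[u^6], E[u^8] for u = n^(-1/2) * sum_j pi_1(y_j)."""
    # m[r] = E[pi_1(y)^r]; odd moments vanish by symmetry.
    m = {r: Fraction(12) ** (r // 2) * uniform_even_moment(r) for r in (2, 4, 6, 8)}
    n = Fraction(n)
    f2 = n * (n - 1)          # ordered choices of 2 distinct observations
    f3 = f2 * (n - 2)
    f4 = f3 * (n - 3)
    e2 = m[2]
    e4 = (n * m[4] + 3 * f2 * m[2] ** 2) / n**2
    e6 = (n * m[6] + 15 * f2 * m[4] * m[2] + 15 * f3 * m[2] ** 3) / n**3
    e8 = (n * m[8] + 28 * f2 * m[6] * m[2] + 35 * f2 * m[4] ** 2
          + 210 * f3 * m[4] * m[2] ** 2 + 105 * f4 * m[2] ** 4) / n**4
    return e2, e4, e6, e8


def central_moments_psi1(n):
    """Mean and central moments mu_2, mu_3, mu_4 of psi_1^2 = u_1^2."""
    e2, e4, e6, e8 = raw_moments_u(n)
    mu2 = e4 - e2**2
    mu3 = e6 - 3 * e4 * e2 + 2 * e2**3
    mu4 = e8 - 4 * e6 * e2 + 6 * e4 * e2**2 - 3 * e2**4
    return e2, mu2, mu3, mu4


if __name__ == "__main__":
    for n in (5, 10, 20):
        mean, mu2, mu3, mu4 = central_moments_psi1(n)
        print(n, float(mu2), float(mu3), float(mu3**2 / mu2**3), float(mu4 / mu2**2))
```

Exact rational arithmetic is used so that the results can be compared term by term with closed-form expressions in $1/n$; as $n$ grows the values tend to 2, 8 and 60, the central moments of $\chi^2$ with one degree of freedom.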

TABLE I

True moments of $\psi_k^2$

            $\psi_1^2 = u_1^2$ $(k = 1)$                                         $\psi_2^2 = u_1^2 + u_2^2$ $(k = 2)$

$M_1'$      $1$                                                                  $2$

$\mu_2$     $2 - \frac{6}{5n}$                                                   $4 - \frac{32}{35n}$

$\mu_3$     $8 - \frac{72}{5n} + \frac{48}{7n^2}$                                $16 + \frac{704}{49n} - \frac{722208}{35035n^2}$

$\mu_4$     $60 - \frac{936}{5n} + \frac{7524}{35n^2} - \frac{432}{5n^3}$        $144 + \frac{15216}{49n} - \frac{2203468}{35035n^2} + \frac{17946980}{119119n^3}$

It will be noticed that for $n$ small the moments of $\psi_1^2$ and $\psi_2^2$ differ considerably
from those of $\chi^2$. It is, however, only fair to point out here that Neyman supposes,
as in the case of the $\chi^2$ test, that a “smooth” test for goodness of fit would not be
carried out except on a large number of observations. It seems to be possible by
an approximate method to obtain some idea of how large $n$ should be.

* To be discussed in Part II of this paper.

A consideration of the $\beta$-coefficients calculated from the moments of Table I
suggests that the most appropriate curves to use would be a Pearson Type I for
$\psi_1^2$ and Type VI for $\psi_2^2$. Since, however, it was hoped that both curves would approximate
quickly to the Type III distribution, as a first step Type III curves with
the start of the curve at the origin were fitted to the moments of both $\psi_1^2$ and $\psi_2^2$.
As there are only two constants to evaluate for the Type III curve with zero
start, the procedure was equivalent to finding a $\chi^2$ distribution with the mean and
standard deviation equal to the corresponding true values for $\psi^2$. For purposes of
comparison, the 5 % and 1 % levels of $\chi^2$ with $f = 1$ and $f = 2$ were taken, and the
tail areas of the fitted Type III curve corresponding to these abscissae were
calculated from Tables of the Incomplete $\Gamma$-Function (K. Pearson, 1922). The
results are summarized in Table II.

TABLE II

$f = 1$, $\chi^2_{0.05} = 3.841$, $\chi^2_{0.01} = 6.635$

  $n$      $\mu_2$     $P\{\psi_1^2 > 3.841\}$    $P\{\psi_1^2 > 6.635\}$
  5        1.76        0.0523                     0.0098
  10       1.88        0.0511                     0.0099
  ...      ...         ...                        ...

$f = 2$, $\chi^2_{0.05} = 5.991$, $\chi^2_{0.01} = 9.210$

  $n$      $\mu_2$     $P\{\psi_2^2 > 5.991\}$    $P\{\psi_2^2 > 9.210\}$
  5        3.8171      0.0511                     0.0100
  ...      ...         ...                        ...
  100      3.9909      0.0500                     0.0100

The probability levels of $\psi^2$ as given by the empirical Type III curve approximate
quite quickly to those of $\chi^2$, and at first sight it would appear that for any
size of sample little risk would be involved in testing the significance of $\psi^2$ by
means of the $\chi^2$ tables. However, it must not be forgotten that the empirical curve
is itself an approximation to the unknown true distribution. To obtain some
idea therefore of the approximation involved it is necessary to calculate the
$\beta_1$ and $\beta_2$ of the fitted Type III curves and compare them with the $\beta$'s of the
true distribution of $\psi^2$ and of the $\chi^2$ distribution. These values of the $\beta$'s are
plotted in the diagram and their numerical values are given in Table IV. It will
be noted that the $\beta$'s of the fitted Type III curves lie nearer the points (8, 15)
and (4, 9) than do those of the true $\psi^2$. Hence, while the results of Table II suggest
that the approximation of the $\psi^2$ distribution by a Type III distribution is a
fairly good one, it would appear that further investigation is advisable before any
definite conclusion may be drawn.

It was therefore decided to utilize the true third moments of $\psi^2$. Type I
curves with zero start were fitted using the first three moments of $\psi_1^2$, and similarly
Type VI curves using those of $\psi_2^2$. It will be recognized that the procedure for the
fitting of a Type VI curve and the evaluation of its integral is very similar to that
for the fitting of Type I, and the calculations required were almost the same for
both curves. Accordingly it is necessary to set out the steps followed for the case
of Type I only.

If the equation of the Type I curve is written as

$$y = \frac{\Gamma(m_1 + m_2 + 2)}{a\,\Gamma(m_1 + 1)\,\Gamma(m_2 + 1)} \left(\frac{x}{a}\right)^{m_1} \left(1 - \frac{x}{a}\right)^{m_2}, \tag{11}$$

easy algebra will give

$$M_1' = \frac{a(m_1 + 1)}{(m_1 + m_2 + 2)}, \qquad M_2' = aM_1'\,\frac{(m_1 + 2)}{(m_1 + m_2 + 3)}, \qquad M_3' = aM_2'\,\frac{(m_1 + 3)}{(m_1 + m_2 + 4)}, \tag{12}$$

where $M_1'$, $M_2'$ and $M_3'$ are the true moments of $\psi_1^2$ about the origin. It was not
found possible to write down the solutions of $a$, $m_1$ and $m_2$ in any neat form, and
the procedure followed was to substitute numerical values for $M_1'$, $M_2'$ and $M_3'$ at
an early stage. The partial integral of (11) is recognized as an Incomplete Beta-Function.
Hence to calculate the tail areas corresponding to $\chi^2_{0.05}$ and $\chi^2_{0.01}$ it will
only be necessary to refer to Tables of the Incomplete Beta-Function (K. Pearson,
1934). Table III gives the results of interpolation into these tables.
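
The numerical solution for $a$, $m_1$ and $m_2$ can be sketched as follows. This is one possible route and is not claimed to be the procedure actually followed: writing $r_1 = M_1'$, $r_2 = M_2'/M_1'$, $r_3 = M_3'/M_2'$ and $s = m_1 + m_2 + 2$, the three moment relations of a Type I curve with zero start become $r_1 s = a(m_1 + 1)$, $r_2(s + 1) = a(m_1 + 2)$, $r_3(s + 2) = a(m_1 + 3)$, which are linear in $a$ and $s$. The function names are hypothetical, and the third moment used in the usage example is computed from the rectangular law and is an assumption of the sketch.

```python
def raw_from_central(mean, mu2, mu3):
    """Moments about the origin from the mean and central moments mu_2, mu_3."""
    return mean, mu2 + mean**2, mu3 + 3 * mean * mu2 + mean**3


def fit_type_one(M1, M2, M3):
    """Solve the three moment relations of a zero-start Type I curve for (a, m1, m2)."""
    r1, r2, r3 = M1, M2 / M1, M3 / M2
    s = 2 * (r3 - r2) / (2 * r2 - r1 - r3)   # s = m1 + m2 + 2
    a = r2 * (s + 1) - r1 * s                # range of the curve
    p = r1 * s / a                           # p = m1 + 1
    return a, p - 1, s - p - 1


if __name__ == "__main__":
    n = 10
    mu2 = 2 - 6 / (5 * n)                         # 1.88, as in Table II
    mu3 = 8 - 72 / (5 * n) + 48 / (7 * n * n)     # assumed third central moment
    a, m1, m2 = fit_type_one(*raw_from_central(1.0, mu2, mu3))
    print(round(a, 4), round(m1, 5), round(m2, 5))
```

For $n = 10$ the value of $a$ obtained in this way is close to the entry 45.3643 of Table III, and $m_1$ comes out near $-0.5$; the tail areas themselves still require the Incomplete Beta-Function.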

It will be noted that for strict accuracy triple interpolation would be necessary.
This, however, called for much calculation, so an approximation was made by
taking $m_1 = -0.50$ and using a double interpolation formula only. The effect of
the approximation is to make the interpolated values slightly greater* numerically
than those values which would have been obtained by the use of triple interpolation
formulae. As $n$ increases $m_1$ gradually approaches the value $-0.5$. For
values of $n$ greater than 20 the required Beta-Function ratio fell outside the range
of the existing tables. These values therefore could not be evaluated except by
quadrature but the results for $n = 5$, 10 and 20 give sufficient indication for our
purposes of the approximate error involved. A system of triple interpolation was
carried out for the evaluation of the integral of the Type VI curve but the calculations
became so very heavy that it was only found practicable to obtain the result
correct to three decimal places. The results of interpolation were confirmed
approximately by a rough quadrature of the curves. The tail areas of the Type I
and Type VI curves beyond $\chi^2_{0.05}$ are seen to be slightly less than 0.05 but nearer to
that value than the corresponding areas from the Type III approximation shown
in Table II. On the other hand the Type I and Type VI areas beyond $\chi^2_{0.01}$ are a
little further from 0.01 than for the Type III approximation. Except perhaps in
the case of $n = 5$ of the Type I curve these differences can hardly be regarded as of
importance. To judge how closely the Type I and Type VI curves are likely to
approximate to the unknown true distributions we may compare their $\beta_2$ values
with the true values. This has been done in Table IV and in the diagram. The
proximity in position between the solid and dotted circles in the latter suggests
that the Type I and Type VI curves must represent the true $\psi_1^2$ and $\psi_2^2$ distributions
very closely.

* A rough form of quadrature for the case $f = 1$, $n = 5$ gives $P\{\psi_1^2 > 3.841\} = 0.0481$.

Fig. 1. $\beta_1$, $\beta_2$ points for true and approximate distributions. Points plotted:
true distribution of $\psi_1^2$; true distribution of $\psi_2^2$; $\chi^2$ distribution; Type III fitted
to true moments of $\psi_1^2$; Type III fitted to true moments of $\psi_2^2$; Type I fitted to true
moments of $\psi_1^2$; Type VI fitted to true moments of $\psi_2^2$.

TABLE III

                          $f = 1$                                  $f = 2$
                          $\chi^2_{0.05} = 3.841$, $\chi^2_{0.01} = 6.635$      $\chi^2_{0.05} = 5.991$, $\chi^2_{0.01} = 9.210$

$n$                       5          10         20            5          10         20
$\mu_3$                   5.3944     ...        7.2971        18.0489    17.2312    16.6668
$a$                       ...        45.3643    95.2356+      16.2813    30.8664    60.3289
$m_1$                     ...        ...        ...           0.29947    0.154503   0.07826-
$m_2$                     8.63986+   ...        46.07631      10.57863   17.81708   33.62469
$P\{\psi^2 > \chi^2_{0.05}\}$   ...        ...        0.050-        0.047+     0.049+     0.050-
$P\{\psi^2 > \chi^2_{0.01}\}$   ...        ...        0.010-        0.010-     0.010-     0.010-

TABLE IV

         Type of curve               (columns: successive values of $n$)

$\psi_1^2$   True values       $\beta_1$    5.34     ...      ...      ...
                               $\beta_2$    9.84     ...      ...      ...
             Fitted Type III   $\beta_1$    6.96     ...      7.74     ...
                               $\beta_2$    13.42    14.22    14.61    ...
             Fitted Type I     $\beta_1$    5.34     6.61     7.62     7.86
                               $\beta_2$    9.94     12.30    13.89    14.71

$\psi_2^2$   True values       $\beta_1$    5.86     4.97     ...      4.10
                               $\beta_2$    14.06    11.43    ...      9.24
             Fitted Type III   $\beta_1$    4.30     4.14     ...      4.01
                               $\beta_2$    9.46     9.21     9.12     9.01
             Fitted Type VI    $\beta_1$    5.86     ...      4.49     4.10
                               $\beta_2$    13.74    11.37    10.17    9.23

Thus the probability levels of the Type I and Type VI curves given in Table III
may be taken as lying not very far from the true values, and this shows that no
great error will be made if we assume for samples of 20 and over that Neyman’s
criteria of the first and second orders are distributed as $\chi^2$ with one and two
degrees of freedom respectively.

3. Sphere of usefulness of the “smooth” test

It was pointed out in § 1 that where the deviations of the observations from the
hypothesis tested are consecutively positive (or negative) then the “smooth”
test for goodness of fit may be applied. It does not seem possible to give any
definite rule as to when it might be used in preference to the $\chi^2$ test. Neyman
remarks that it would be interesting to enquire into the relative workings of the
$\chi^2$ and $\psi^2$ tests, possibly with this point in mind, but nothing has been written
on the subject and it is difficult to see how a comparison of the powers of the two
tests could be obtained.

While recognizing the ingenuity of Neyman’s “smooth” test, and the attempt
it makes usefully to supplement the $\chi^2$ test, the present writer feels that as it
stands it is not wholly satisfactory, and that much work remains to be done before
it can pass into common use. It has been pointed out in § 1 that Neyman (1937)
and E. S. Pearson (1938) have both discussed the appropriate order of test to
choose, without however reaching any definite conclusion. It has been possible
to remedy this in part by confirming E. S. Pearson’s suggestions through numerical
calculations, but an extension of the theory appears necessary before the test is
generally applicable.

In § 2 it has been shown that serious error will not be made if the use of the
“smooth” test is restricted to samples of twenty and over, and Neyman himself
says that he would use the test only if the available sample is large. At present,
however, the application of the test presents rather a formidable task for any
computer. For example, if the sample is of size 100, and it is desired to apply a
test of the second order, the numerical work would entail 100 entries into tables
of the appropriate probability integral with possible interpolations and 100
substitutions into each of the first and second polynomials. Hence for a sample
greater in size than 100 the labour of computation involved will incline the
computer towards the simpler and more familiar $\chi^2$ test. The need for such a
“smooth” test cannot be denied, but it would have greater utility if it could be
applied after some grouping of the observations has been made. Whether this is
a possible development or not I do not know.

At present it is only possible to apply the “smooth” test when the parameters
of the hypothesis tested are known, whereas the $\chi^2$ test permits one to calculate
them from the data, as for example in the fitting of Pearson curves. What the
effect of calculating parameters from the data would be on the $\psi^2$ criterion I
cannot at present suggest, although it is hoped to throw light on this point by
means of a sampling experiment. Certainly if parameters are calculated from the
data and used in the basic hypothesis, $H_0$, it would mean that the variables $y$
would be no longer independent or rectangularly distributed. Whether it is
possible to determine what form the probability law of $y$ would take, given
restrictions on the original sample, remains to be investigated.


I have to thank Prof. E. S. Pearson for helpful criticism and Miss J. Townend
for drawing the figure.


TESTS OF HYPOTHESES CONCERNING
LOCATION AND SCALE PARAMETERS


By E. J. G. PITMAN

University of Tasmania

1. The comparison of location parameters
Suppose that $k$ observed numbers

$$x_1, x_2, \ldots, x_k$$

are values of $k$ chance variables, and that the elementary probability function of
the simultaneous distribution of the chance variables is

$$F(x_1 - a_1, x_2 - a_2, \ldots, x_k - a_k),$$

the function $F$ being of known form but the values of the location parameters
$a_1, a_2, \ldots, a_k$ being unknown. Suppose, further, that we wish to test the hypothesis
that

$$a_1 = a_2 = \ldots = a_k,$$

which we shall call the hypothesis $H_0$. Any test must be based upon some statistic
$J$ which is a function of the observed values $x_1, x_2, \ldots, x_k$, and $H_0$ will be accepted
for certain values of $J$ and rejected for all other values. If the observations are our
only source of knowledge of the values of the location parameters, a satisfactory
test must give the same answer when the observed values are

$$x_1 + \lambda, x_2 + \lambda, \ldots, x_k + \lambda,$$

as when they are $x_1, x_2, \ldots, x_k$.

Hence the statistic $J$ must have the property

$$J(x_1 + \lambda, \ldots, x_k + \lambda) = J(x_1, \ldots, x_k). \tag{1}$$

Without loss of generality, we may assume that $J$ is always positive, and that
a small value of $J$ is regarded as significant, i.e. as indicating that the hypothesis
is untrue. If this is not so, we simply replace $J$ by some suitable function of it.
In order to use the observed value of $J$ as a test of $H_0$, we must know the distribution
of $J$ when $H_0$ is true. Since $J$ has the property (1), its value will be unaffected
by the same change of origin of all the $x$. Hence, when the $a$ are all equal, we may,
without loss of generality, assume that their common value is 0. We require then
the distribution of $J$ when

$$a_1 = a_2 = \ldots = a_k = 0.$$

Let $p$ be any given number (such as 0.95) between 0 and 1. When this distribution
of $J$ is known, we can determine $b$ such that, when $H_0$ is true, the probability that
$J \geq b$ is $p$. Having chosen $p$, we reject the hypothesis if the observed value of
$J$ is less than $b$, and accept it otherwise. The probability of rejecting $H_0$ when it is
true will be $1 - p$. If $J$ is so chosen that for a fixed value of $b$ the probability that
$J \geq b$ is greatest when $H_0$ is true, the hypothesis $H_0$ will be more likely to be accepted
when it is true than when any alternative hypothesis is true. A test which
has this property is said to be “unbiased” (Neyman & Pearson, 1936, p. 8). Let
us attempt to determine $J$ so that the resulting test is unbiased.

A set of values of the $x$ may be specified by a point (the sample point) whose
rectangular co-ordinates in a $k$ dimensional space are $(x_1, x_2, \ldots, x_k)$. For the
co-ordinates of a variable point in this space we shall use $(\xi_1, \ldots, \xi_k)$. Since $J$ has
the property (1), it will be constant along any line parallel to the line

$$\xi_1 = \xi_2 = \ldots = \xi_k, \tag{2}$$

and therefore the region $A$, consisting of all points for which $J \geq b$, will be bounded
by a cylindrical hypersurface with its generators parallel to the line (2). When
$H_0$ is true, the probability that $J \geq b$ is

$$p = \int_A F(\xi_1, \ldots, \xi_k)\,d\xi_1 \ldots d\xi_k,$$

and when $H_0$ is not true, the probability is

$$\int_A F(\xi_1 - a_1, \ldots, \xi_k - a_k)\,d\xi_1 \ldots d\xi_k = \int_B F(\xi_1, \ldots, \xi_k)\,d\xi_1 \ldots d\xi_k,$$

where $B$ is a region formed from $A$ by translation without rotation. Thus, the
necessary and sufficient condition for an unbiased test is that the integral of
$F(\xi_1, \ldots, \xi_k)$ over the region $A$ be greater than its integral over any equal and
parallel region $B$.

If $L$ is any line parallel to the line (2), we shall write

$$P(L) = \int F(\xi_1, \ldots, \xi_k)\,d\eta,$$

where

$$\eta = \Sigma\xi/\sqrt{k},$$

the distance of the point $(\xi_1, \ldots, \xi_k)$ from the plane $\Sigma\xi = 0$, and the integral is
taken along $L$. Now if $A$ is defined as the locus of all lines $L$ for which $P(L)$ is
greater than or equal to some constant $h$, $P(L)$ will be less than $h$ on any line $L$
in $B$ which is not also in $A$. From this it easily follows that

$$\int_A F(\xi_1, \ldots, \xi_k)\,d\xi_1 \ldots d\xi_k > \int_B F(\xi_1, \ldots, \xi_k)\,d\xi_1 \ldots d\xi_k,$$

and so the resulting test is unbiased. Hence we shall have an unbiased test if we
define $J$ at any point as equal to $P(L)$ for the line $L$ through that point.
The co-ordinates of a variable point on the line $L$ through the point

$$(x_1, x_2, \ldots, x_k)$$

may be expressed in the form

$$\xi_r = x_r - t \quad (r = 1, 2, \ldots, k), \qquad \eta = \sqrt{k}\,(\bar{x} - t),$$

where

$$\bar{x} = \frac{\Sigma x}{k}.$$

The expression for $P(L)$ becomes

$$P(L) = \sqrt{k} \int_{-\infty}^{\infty} F(x_1 - t, \ldots, x_k - t)\,dt.$$

Instead of $P(L)$ we shall take $P(L)/\sqrt{k}$ for $J$, and, replacing the symbol $t$ by the
symbol $a$, we have

$$J = \int_{-\infty}^{\infty} F(x_1 - a, \ldots, x_k - a)\,da.$$

The test based on this $J$ will be unbiased.

As a simple illustration, take the case where the $x$ are independent normal
variables, each with unit standard deviation, and the mean value of $x_r$ is $a_r$.

$$F(x_1 - a_1, \ldots, x_k - a_k) = \frac{1}{(2\pi)^{k/2}} \exp\{-\tfrac12 \Sigma(x_r - a_r)^2\},$$

therefore

$$J = \frac{1}{(2\pi)^{k/2}} \int_{-\infty}^{\infty} \exp\{-\tfrac12 \Sigma(x - a)^2\}\,da = \frac{e^{-\frac12 S}}{k^{1/2} (2\pi)^{\frac12 (k-1)}},$$

where $S = \Sigma(x_r - \bar{x})^2$, $\bar{x} = \Sigma x_r / k$.

In practice we would take $S$, which is a monotonic function of $J$, as our criterion,
large values of $S$ being significant. When $H_0$ is true, $S$ is distributed like $\chi^2$ with
$k - 1$ degrees of freedom.
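
The closed form for $J$ in the normal case can be checked by direct numerical integration. The sketch below is only an illustration: it evaluates $\int F(x_1 - a, \ldots, x_k - a)\,da$ by the midpoint rule for unit-variance normal variables and compares it with $e^{-S/2} / \{k^{1/2}(2\pi)^{(k-1)/2}\}$. The function names and the integration range are assumptions of the sketch.

```python
import math


def criterion_closed_form(xs):
    """J = exp(-S/2) / (sqrt(k) * (2*pi)^((k-1)/2)), with S the sum of squares about the mean."""
    k = len(xs)
    mean = sum(xs) / k
    s = sum((x - mean) ** 2 for x in xs)
    return math.exp(-0.5 * s) / (math.sqrt(k) * (2 * math.pi) ** ((k - 1) / 2))


def criterion_by_quadrature(xs, half_width=12.0, steps=20000):
    """Integrate F(x_1 - a, ..., x_k - a) over a by the midpoint rule."""
    k = len(xs)
    mean = sum(xs) / k
    lo = mean - half_width            # the integrand is negligible outside this range
    h = 2 * half_width / steps
    total = 0.0
    for i in range(steps):
        a = lo + (i + 0.5) * h
        total += math.exp(-0.5 * sum((x - a) ** 2 for x in xs))
    return total * h / (2 * math.pi) ** (k / 2)


if __name__ == "__main__":
    xs = [0.3, -1.2, 2.0, 0.7]
    print(criterion_closed_form(xs), criterion_by_quadrature(xs))
    shifted = [x + 5.0 for x in xs]   # J is unchanged by a common change of origin
    print(criterion_closed_form(shifted))
```

The two evaluations agree closely, and adding the same constant to every observation leaves $J$ unchanged, which is property (1).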

Suppose now that the hypothesis to be tested asserts not merely that the
location parameters of the chance variables $x$ are all equal, but that these location
parameters all have the same particular given value. By a change of origin of all
the $x$ this particular value may be taken as 0. Thus the hypothesis to be tested is
that

$$a_1 = a_2 = \ldots = a_k = 0,$$

which we shall call the hypothesis $H_0'$. It is obvious that if we take

$$J' = F(x_1, \ldots, x_k)$$

and regard small values of $J'$ as significant, the test will be unbiased. Moreover, it
is very easy to show that, if the $x$ are independent, so that $F$ is the product of the
separate elementary probability functions, and if each elementary probability
function is unimodal, then, when $H_0'$ is not true, the probability that $J' \geq b$
increases as any $a$ moves towards the value 0.

2. The comparison of scale parameters

Here the elementary probability function of the chance variables $x$ is

$$\frac{1}{c_1 c_2 \ldots c_k}\, F\left(\frac{x_1}{c_1}, \frac{x_2}{c_2}, \ldots, \frac{x_k}{c_k}\right),$$

where the $c$ are all positive and we wish to test the hypothesis,

$$c_1 = c_2 = \ldots = c_k,$$

which we shall call the hypothesis $H_1$. Let us assume for the moment that the $x$
are all positive chance variables. Putting

$$y_r = \log x_r, \qquad a_r = \log c_r,$$

we have for the elementary probability function of the chance variables $y$

$$\exp(\Sigma y - \Sigma a) \cdot F\{\exp(y_1 - a_1), \ldots, \exp(y_k - a_k)\}.$$

The hypothesis $H_1$ applied to the parameters $c$ is equivalent to $H_0$ applied to the
parameters $a$. We shall therefore have an unbiased test if we take as our criterion

$$J = \int_{-\infty}^{\infty} \exp(\Sigma y - ka) \cdot F\{\exp(y_1 - a), \ldots, \exp(y_k - a)\}\,da
    = \Pi\{x\} \int_0^{\infty} \frac{F(x_1/c, \ldots, x_k/c)}{c^{k+1}}\,dc.$$

The expression for $J$ in the general case where the $x$ are not necessarily positive
chance variables is

$$J = \Pi\{|x|\} \int_0^{\infty} \frac{F(x_1/c, \ldots, x_k/c)}{c^{k+1}}\,dc.$$

Small values of this are significant. The axial planes divide the $k$ dimensional
space into $2^k$ regions, and the probability of the sample point falling in any one of
these regions is independent of the particular values of the parameters $c$. By
considering these possibilities separately and using the result for the case of all
positive chance variables, it is easy to see that the test is still unbiased. $J$ is a
homogeneous function of degree 0 in the $x$; hence, when $H_1$ is true, its distribution
is independent of the particular common value of the scale parameters $c$. Therefore
in determining this distribution we may take the common value to be 1.
From the corresponding result of § 1, it follows that an unbiased test of the
hypothesis $H_1'$, that

$$c_1 = c_2 = \ldots = c_k = 1,$$

will be obtained by using

$$J' = \Pi\{|x|\}\, F(x_1, \ldots, x_k),$$

small values of which are significant. Again, from the corresponding result of § 1
it follows that if the $x$ are independent positive chance variables with elementary
probability functions $c_r^{-1} f_r(x_r/c_r)$ such that $x f_r(x)$ is a unimodal function of $x$, then,
when $H_1'$ is not true, the probability that $J' \geq b$ increases as any $c$ moves towards
the value 1.

Tests of other hypotheses will be discussed in a later paper. The remainder of
this paper will be devoted to a discussion of the most important application of the
results of this section.

The problem of deciding whether k samples, each known to have been drawn 
from a normal population, have been drawn from populations with the same 
standard deviation has been discussed by Neyman & Pearson (1931), who have 
proposed a criterion for this purpose. Applying the method of this paper, we 
obtain a test which is the same as that proposed by Bartlett (1937, p. 273). 
Bartlett’s test is therefore unbiased. It can be shown that the Neyman-Pearson 
test is unbiased only when the number in each sample is the same, in which case 
it is exactly equivalent to Bartlett’s test. 

A continuous chance variable which takes values from 0 to $\infty$ with elementary
probability function

$$\frac{1}{\Gamma(m)}\, x^{m-1} e^{-x}$$

will be called a $\Gamma(m)$ variable. If its elementary probability function has the more
general form

$$\frac{1}{c\,\Gamma(m)} \left(\frac{x}{c}\right)^{m-1} e^{-x/c},$$

we shall call it an unscaled $\Gamma(m)$ variable with scale parameter $c$. If

$$y_1, y_2, \ldots, y_n$$

are $n$ numbers with mean $\bar{y}$, we shall call the expression

$$S = \sum_{r=1}^{n} (y_r - \bar{y})^2$$

their squariance.† We know that if the $y$ are a sample of $n$ values from a normal
population of standard deviation $\sigma$, $S/\sigma^2$ is distributed like $\chi^2$ with $n - 1$ degrees
of freedom, or, what is more convenient here, $S/2\sigma^2$ has a $\Gamma\{\tfrac12(n - 1)\}$ distribution,
so that $S$ is an unscaled $\Gamma\{\tfrac12(n - 1)\}$ variable with scale parameter $2\sigma^2$. Moreover,
in this case, if the mean of the population is unknown, the whole of the information
about $\sigma$ supplied by the sample is contained in the value of $S$. If the mean of the
population has the known value $a$,

$$\sum_{r=1}^{n} (y_r - a)^2$$

is a sufficient statistic for the estimation of $\sigma$; it is an unscaled $\Gamma(\tfrac12 n)$ variable with
scale parameter $2\sigma^2$. Hence questions about variances of normal populations can
be reduced to questions about scale parameters of unscaled $\Gamma$ variables. In
particular, if we have $k$ samples with squariances

$$S_1, S_2, \ldots, S_k,$$

and if each sample has been drawn from a normal population of unknown mean
and variance, the question “Are the variances of the normal populations all
equal?” is equivalent to “Are the scale parameters of the unscaled $\Gamma$ variables
$S_1, S_2, \ldots, S_k$ all equal?” The problem of answering this latter question we now
proceed to consider.

† If this word is objected to, I hope that someone will coin a better, for it is very convenient to
have a name for this expression. The usual “sum of squares” is not precise in meaning and is
awkward in use. As I have pointed out previously (Pitman, 1937, p. 215), it is the squariance of a
sample of $n$ that appears naturally in the mathematical theory, and not the variance $S/n$, or the
“estimated variance” $S/(n - 1)$. The last, however, becomes prominent in the applications.

3. Application to gamma variables

Suppose that

$$x_1, x_2, \ldots, x_k$$

are $k$ observed numbers, that $x_r$ is a value of an unscaled $\Gamma(m_r)$ variable with scale
parameter $c_r$, and that the $k$ chance variables are independent. We wish to test the
hypothesis $H_1$ that

$$c_1 = c_2 = \ldots = c_k.$$

Put $M = \Sigma m$.

Except where explicitly stated otherwise, all summations $\Sigma$ and all products $\Pi$
are to be taken over the $k$ values of the operand, which will be written without
an index. Thus

$$\Sigma m = \sum_{r=1}^{k} m_r, \qquad \Pi\{x^m\} = \prod_{r=1}^{k} x_r^{m_r}.$$

The elementary probability function of the distribution of the $x$ is

$$\frac{\Pi\{x^{m-1}\}\, e^{-\Sigma(x/c)}}{\Pi\{c^m\, \Gamma(m)\}}.$$

Hence

$$J = \frac{\Pi\{x^m\}}{\Pi\{\Gamma(m)\}} \int_0^{\infty} \frac{e^{-(\Sigma x)/c}}{c^{M+1}}\,dc
    = \frac{\Gamma(M)\,\Pi\{x^m\}}{\Pi\{\Gamma(m)\}\,(\Sigma x)^M}.$$

It is sometimes more convenient to deal with

$$K = \frac{\Pi\{x^m\}}{(\Sigma x)^M},$$

which differs from $J$ only by a constant factor. The maximum value of $K$ is

$$\frac{\Pi\{m^m\}}{M^M},$$

and we shall denote the logarithm of the ratio of this maximum to the value of $K$
by $L$, so that

$$L = M \log(\Sigma x / M) - \Sigma\{m \log(x/m)\}.$$

It is necessarily positive or zero, and large values are significant.
For testing the hypothesis $H_1'$ that

$$c_1 = c_2 = \ldots = c_k = 1,$$

the appropriate criterion is

$$J' = \frac{\Pi\{x^m\}\, e^{-\Sigma x}}{\Pi\{\Gamma(m)\}}.$$

We shall write

$$K' = e^{-\Sigma x}\, \Pi\{x^m\},$$

$$L' = \log(K'_{\max}) - \log K' = \Sigma x - M - \Sigma\{m \log(x/m)\}.$$
4. The distribution of K when $H_1$ is true

As explained above, when $H_1$ is true we may take the common value of the c
to be 1. The probability that $K \ge \kappa$ is

$$p = \frac{1}{\Pi\{\Gamma(m)\}}\int \Pi\{x^{m-1}\}\,e^{-\Sigma x}\,dx_1 \ldots dx_k$$

over the region where

$$\frac{\Pi\{x^{m}\}}{(\Sigma x)^{M}} \ge \kappa.$$

Make the change of variables

$$\Sigma x = u, \qquad x_r = u y_r \quad (r = 1, 2, \ldots, k-1),$$

and write for convenience $y_k = 1 - \sum_{r=1}^{k-1} y_r$, so that $\Sigma y = 1$ and $x_k = u y_k$.

Then integrating with respect to u, which is not involved in the boundary condition,
we obtain for the value of p

$$\frac{\Gamma(M)}{\Pi\{\Gamma(m)\}}\int \Pi\{y^{m-1}\}\,dy_1 \ldots dy_{k-1} \quad\text{over}\quad \Pi\{y^{m}\} \ge \kappa.$$

As the maximum value of K is $\Pi\{m^{m}\}/M^{M}$, we are concerned only with values of
$\kappa$ less than this. It should also be noted that in the applications to normal variables
only integral and half-integral values of $m_r$ occur.

When $\kappa = 0$, p = 1, and the region of integration is

$$0 \le y_r \le 1 \quad (r = 1, 2, \ldots, k-1), \qquad 0 \le \sum_{r=1}^{k-1} y_r \le 1.$$

Therefore $\int \Pi\{y^{m-1}\}\,dy_1 \ldots dy_{k-1}$ over this region is equal to

$$\frac{\Pi\{\Gamma(m)\}}{\Gamma(M)},$$

which may be denoted by $B(m_1, m_2, \ldots, m_k)$. It is the extension of the complete
beta function.
When k = 2, the expression for p is

$$p = \frac{1}{B(m_1, m_2)}\int_{x_1}^{x_2} x^{m_1-1}(1-x)^{m_2-1}\,dx,$$

where $x_1^{m_1}(1-x_1)^{m_2} = x_2^{m_1}(1-x_2)^{m_2} = \kappa$.

In the tabulations in § 5 for k = 2, the value of $x_1$ is given.
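
The k = 2 case can be checked numerically. The sketch below is only an illustration under stated assumptions: the function name `exact_p_two_samples` is hypothetical, the upper root is found by bisection, and the incomplete beta integral is approximated by Simpson's rule rather than read from tables. It needs Python 3.9 or later for `math.nextafter`.

```python
import math

def exact_p_two_samples(x1, m1, m2, steps=2000):
    """P(K >= kappa) for k = 2, given the lower root x1 of
    x**m1 * (1 - x)**m2 = kappa (kappa is implied by x1).
    `steps` must be even (Simpson's rule)."""
    mode = m1 / (m1 + m2)  # point where x**m1 * (1-x)**m2 is largest
    if not 0.0 < x1 < mode:
        raise ValueError("x1 must lie strictly between 0 and m1/(m1+m2)")

    def log_k(x):
        return m1 * math.log(x) + m2 * math.log(1.0 - x)

    # Bisection for the upper root x2 in (mode, 1), where log_k decreases.
    target = log_k(x1)
    lo, hi = mode, math.nextafter(1.0, 0.0)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if log_k(mid) > target:
            lo = mid
        else:
            hi = mid
    x2 = 0.5 * (lo + hi)

    def density(x):
        return x ** (m1 - 1.0) * (1.0 - x) ** (m2 - 1.0)

    # Simpson's rule for the integral of the beta density between the roots.
    h = (x2 - x1) / steps
    total = density(x1) + density(x2)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * density(x1 + i * h)
    beta = math.exp(math.lgamma(m1) + math.lgamma(m2) - math.lgamma(m1 + m2))
    return total * h / (3.0 * beta)
```

For $m_1 = m_2 = 1$ the integrand is constant and p is simply $x_2 - x_1$, so $x_1 = 0.25$ gives p = 0.5; the probability of a value of L as great as or greater than that observed is then 1 − p.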

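When k = 3,

$$p = \frac{1}{B(m_1, m_2, m_3)}\iint y_1^{m_1-1}y_2^{m_2-1}(1-y_1-y_2)^{m_3-1}\,dy_1\,dy_2$$

over $y_1^{m_1}y_2^{m_2}(1-y_1-y_2)^{m_3} \ge \kappa$.

By the substitution

$$y_1 = uv \quad (0 \le u \le 1), \qquad y_2 = u(1-v) \quad (0 \le v \le 1),$$

this becomes

$$p = \frac{1}{B(m_1, m_2, m_3)}\iint u^{m_1+m_2-1}(1-u)^{m_3-1}v^{m_1-1}(1-v)^{m_2-1}\,du\,dv \qquad (3)$$

over $u^{m_1+m_2}(1-u)^{m_3}v^{m_1}(1-v)^{m_2} \ge \kappa$.

If $m_1 = m_2 = m_3 = m$, and m is integral or half-integral, this is expressible in
terms of complete elliptic integrals; but even when m = 1 the reduction to
standard form is tedious and I have not carried it out. However, Nair (1939) has
shown how to obtain the exact distribution in the general case

$$m_1 = m_2 = \ldots = m_k.$$

It is of some interest to determine the exact distribution in other cases in
order to check the accuracy of the approximate distribution discussed in § 5.
A case which happens to be fairly easy to handle is

$$m_1 = m_2 = m, \qquad m_3 = 2m.$$

From (3)

$$p = \frac{1}{B(m, m, 2m)}\iint u^{2m-1}(1-u)^{2m-1}v^{m-1}(1-v)^{m-1}\,du\,dv$$

over $u^{2}(1-u)^{2}v(1-v) \ge \kappa^{1/m}$.

Putting

$$u = \tfrac12(1+x), \qquad v = \tfrac12(1+y),$$

we obtain

$$p = \frac{1}{2^{6m-4}B(m, m, 2m)}\iint (1-x^2)^{2m-1}(1-y^2)^{m-1}\,dx\,dy$$

over

$$(1-x^2)^2(1-y^2) \ge 64\kappa^{1/m} \qquad (x \ge 0,\ y \ge 0),$$

which is fairly easily expressible in terms of complete elliptic integrals when m
is small and integral or half-integral. When m = 1, so that

$$m_1 = m_2 = 1, \qquad m_3 = 2,$$

$$p = \tfrac32\iint (1-x^2)\,dx\,dy \quad\text{over}\quad (1-x^2)^2(1-y^2) \ge 64\kappa$$

$$\phantom{p} = \tfrac32\int \sqrt{(1-x^2)^2 - 64\kappa}\;dx \quad\text{over}\quad (1-x^2)^2 \ge 64\kappa.$$

This reduces to

$$p = \tfrac32\int_0^{a}\sqrt{(a^2-x^2)(b^2-x^2)}\;dx,$$

where

$$a^2 = 1 - 8\sqrt{\kappa}, \qquad b^2 = 1 + 8\sqrt{\kappa},$$

so that

$$p = b\int_0^{\pi/2}\sqrt{1 - t^2\sin^2\phi}\;d\phi \;-\; 8b\sqrt{\kappa}\int_0^{\pi/2}\frac{d\phi}{\sqrt{1 - t^2\sin^2\phi}},$$

where t = a/b. In the table in § 5 the value of θ = arc sin t is given in order that the
results may more readily be checked from tables of the complete elliptic integrals.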


5. The semi-invariants of L and the approximate distribution
of L when $H_1$ is true

We shall denote the sth semi-invariant of a chance variable z by $\lambda_s(z)$. If x
is a $\Gamma(m)$ variable,

$$\lambda_s(\log x) = G_s(m),$$

where $G_s(m)$ is the sth derivative with respect to m of

$$G(m) = \log\Gamma(m).$$

From this we can easily show that the semi-invariants of

$$L = M\log(\Sigma x/M) - \Sigma\{m\log(x/m)\}$$

are

$$\lambda_1(L) = M\{G_1(M) - \log M\} - \Sigma\{m[G_1(m) - \log m]\},$$

$$s > 1, \qquad \lambda_s(L) = (-)^s\{\Sigma[m^s G_s(m)] - M^s G_s(M)\}.$$

Using the asymptotic expansion of $G_s(m)$ and neglecting terms of order $1/m^2$, we
obtain

$$\lambda_s(L^*) = \tfrac12(k-1)\,(s-1)!,$$

where

$$L^* = \frac{L}{1+a}, \qquad a = \frac{\Sigma(1/m) - 1/M}{6(k-1)}.$$

Hence L* is approximately a $\Gamma\{\tfrac12(k-1)\}$ variable. 2L* is distributed approximately
like $\chi^2$ with k − 1 degrees of freedom, which is Bartlett's result (1937, p. 274), with
a different notation.
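
The scaling step can be illustrated with a short Python sketch. It assumes the correction $a = \{\Sigma(1/m) - 1/M\}/\{6(k-1)\}$, which is the form that agrees with the expression for a in terms of the n given in § 6 when $m = \tfrac12 n$; the function name `two_l_star` is hypothetical.

```python
import math

def two_l_star(x, m):
    """Scaled criterion 2L* = 2L / (1 + a), to be referred to chi-square
    with k - 1 degrees of freedom (assumed form of the correction a)."""
    k = len(x)
    if k < 2 or k != len(m):
        raise ValueError("need at least two samples and matching lengths")
    big_m = sum(m)
    l_value = big_m * math.log(sum(x) / big_m) - sum(
        mr * math.log(xr / mr) for xr, mr in zip(x, m)
    )
    a = (sum(1.0 / mr for mr in m) - 1.0 / big_m) / (6.0 * (k - 1))
    return 2.0 * l_value / (1.0 + a)
```

With $m_1 = m_2 = 1$ and the observations in the ratio 0.25 : 0.75 (that is, $x_1 = 0.25$ in the notation of § 4), a = 1/4 and the sketch gives 2L* = 2 log(4/3)/1.25, about 0.46029.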
The corresponding result used for testing $H_1'$ is that, when $H_1'$ is true,

$$2L^* = \frac{2L'}{1+a}$$

is distributed approximately like $\chi^2$ with k degrees of freedom, where

$$a = \frac{\Sigma(1/m)}{6k}.$$

In the tables below, P is the true probability of obtaining, when $H_1$ is true, a
value of 2L* as great as or greater than that shown. P' is the approximate probability
calculated from the approximate distribution of 2L*. Although in practice
we are concerned only with the upper tail of the distribution of 2L*, it is of some
interest to see how the approximate distribution as a whole compares with the
true distribution.† It will be noted that the approximation improves as a
decreases. The meanings of the symbols $x_1$ and θ are explained in § 4.

TABLE I 


Comparing true and approximate probability integrals 





1, = i 


mm 


B 

B 


2L* 

p 

P' 

0-42 

0-01729 



0-475 

0-00401 

0-96 


0'35 

0-06287 

0-8060 


0-46 

0-01608 

0-90 


0'27 

0-15860 

0-6957 


0-35 

0-15090 

0-70 

0-6977 

OdS 

0-44890 

0-5064 

0-6029 

0-26 

0-46029 



0-06 

1-10715 

0-2871 

0-2927 

0-16 

1-07735 



O'Ol 

2-15262 

0-1276 

0-1423 

0-05 

2-65717 


0-1031 1 


3-68164 

0-0403 

BTorniB 

0-026 

3-72464 





0-0127 


0-005 

6-26726 



■■ 

6-.32320 

0-0056 

0-0119 

0-0025 

7-37228 


■jl 



mi = l , m2= 

II 



OTi = l , mj= 

II 

s’ 



2L* 

P 

p' 

*1 

2L* 

p 

p' 

0-31 

0-00631 

0-9371 

BB 

0-91 

0-00739 

0-9318 

0-9315 

0-30 

0-01303 



0-87 

0-01688 


0-8997 

0-22 

0-16997 

0-6817 




0-7561 

0-7553 

0-16 

0-45502 


0-5000 



0-4625 

0-4616 

0-096 

1-07829 

0-2997 



1-38761 

0-2385 

0-2388 

0-035 

2-63527 

0-1096 


nsR^I 

2-40444 

0-1196 

0-1210 

0-016 

3-78064 

0-0496 



. 3-87237 


0-0491 

0-004 

6-06129 



0-01 

6-19744 


0-0128 

0-002 

7-21619 

B 



7-76793 

Ib 

0-0053 


† See Bishop & Nair (1939), where the results of a very thorough investigation of the upper
tail of the distribution are given. The computations of the present paper were completed before
the publication of Bishop & Nair’s paper.
TABLE I (cont.)






mi=mj=l, 7»3=2, a=i 

k 



p 

F 

e 

21* 

P 

P' 

0-49 

0-003809 

0-95081 

0-96079 

70 

0-10005 

0-9617 

0-9612 

048 

0-01S252 

0-90177 

0-90171 

10“ 

0-20321 

0-9043 

0-9034 

043 

0-188518 

0-66432 

0-66415 

19“ 

0-71677 

0-7004 

0-6988 

0-39 

0-472482 

0-49205 

0-49185 

27“ 

1-40868 

0-4962 

0-4944 

0-31 

1-486259 

0-22306 

0-22295 

36“ 

2-42738 

0-2968 

0-2971 

0-25 

2-739828 

0-09785 

0-09788 

60“ 

4-53262 

0-1001 

0-1037 

0-21 

3-905482 

0-04806 

0-04813 

57“ 

6-88759 

0-0493 

0-0627 

044 

6-958488 

0-00828 

0-00834 

69“ 

9-02451 

0-0093 

0-0110 

041 

8‘928707 

0-00277 

0-00281 

76“ 

11-32688 

0-0027 

0-0036 


6. Applications 

Suppose that we have k samples, each drawn from a normal population of
unknown mean and variance, and that we wish to test the hypothesis that the
variances of the normal populations are all the same. Denote the number of
members of the rth sample by $n_r + 1$, and the squariance of this sample by $S_r$;
then $S_r$ is an unscaled $\Gamma(\tfrac12 n_r)$ variable, and the hypothesis to be tested is equivalent
to the hypothesis $H_1$ applied to the variables $S_r$. Thus the appropriate criterion
is†

$$L = \tfrac12 N\log(\Sigma S/N) - \tfrac12\Sigma\{n\log(S/n)\},$$

where $N = \Sigma n$.

A large value of L is significant.

$$2L = N\log(\Sigma S/N) - \Sigma\{n\log(S/n)\},$$

$$2L^* = \frac{2L}{1+a},$$

where

$$a = \frac{1}{6(k-1)}\{\Sigma(1/m) - 1/M\} = \frac{\Sigma(1/n) - 1/N}{3(k-1)}.$$

When $H_1$ is true, 2L* is distributed approximately like $\chi^2$ with k − 1 degrees of
freedom.‡ From the tabulations in § 5 it is evident that even with very small
samples the approximation is sufficiently good for us to use it to determine

† The difference between this expression, L, and the function $L_1$ as defined by Neyman &
Pearson should be noted. The relation is referred to on p. 212 below as well as being discussed by
Bishop & Nair (1939).

‡ Cf. Bartlett (1937, p. 274).
whether an observed value of L is significant or not. It may be noted that if we
put

$$v_r = S_r/n_r,$$

the “estimated variance” derived from sample r, the expression for 2L is

$$2L = N\log\bar{v} - \Sigma\{n\log v\},$$

where $\bar{v} = \Sigma nv/N$.

Hence L has its minimum value 0 when

$$v_1 = v_2 = \ldots = v_k,$$

and $\partial L/\partial v_r$ has the same sign as $v_r - \bar{v}$, and therefore L increases as the v diverge
more and more from a common value. Thus the L test may be regarded as answering
the question “Are the estimated variances $v_1, v_2, \ldots, v_k$ significantly different?”
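
To make the recipe concrete, here is a minimal Python sketch that goes from raw samples to the scaled criterion 2L*. It assumes the correction $a = \{\Sigma(1/n) - 1/N\}/\{3(k-1)\}$ with $n_r$ one less than the size of sample r; the function name `bartlett_two_l_star` is hypothetical, and a sample with zero squariance is not handled.

```python
import math

def bartlett_two_l_star(samples):
    """Return (2L*, k - 1) for a list of samples (each a list of numbers).

    n_r = (sample size) - 1 and S_r = sum of squared deviations from the
    sample mean (the "squariance"). 2L* is to be referred to chi-square
    with k - 1 degrees of freedom."""
    k = len(samples)
    if k < 2:
        raise ValueError("need at least two samples")
    n, s = [], []
    for sample in samples:
        if len(sample) < 2:
            raise ValueError("each sample needs at least two members")
        mean = sum(sample) / len(sample)
        n.append(len(sample) - 1)
        s.append(sum((v - mean) ** 2 for v in sample))
    big_n = sum(n)
    two_l = big_n * math.log(sum(s) / big_n) - sum(
        nr * math.log(sr / nr) for nr, sr in zip(n, s)
    )
    a = (sum(1.0 / nr for nr in n) - 1.0 / big_n) / (3.0 * (k - 1))
    return two_l / (1.0 + a), k - 1
```

When all the estimated variances $S_r/n_r$ coincide the statistic is zero, and it grows as they diverge from a common value.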

If we wish to test the hypothesis that the variance of each population is
unity, we obtain the appropriate criterion from the fact that this hypothesis is
equivalent to the hypothesis $H_1'$ applied to the variables $\tfrac12 S_r$. The criterion is
therefore

$$L' = \tfrac12\Sigma S - \tfrac12 N - \tfrac12\Sigma\{n\log(S/n)\},$$

and we have

$$2L' = \Sigma S - N - \Sigma\{n\log(S/n)\},$$

$$2L^* = \frac{2L'}{1+a}, \qquad a = \frac{\Sigma(1/n)}{3k}.$$

When $H_1'$ is true, 2L* is distributed approximately like $\chi^2$ with k degrees of freedom.

It should be noted that the field of application of the L test is quite different
from that of Fisher’s z test, which is used in the analysis of variance. When, in the
analysis of variance, we test for significance of treatment differences or significance
of an interaction, the question we put is not “Are these several estimated variances
significantly different?” but “Is this particular estimate, or are any of this
particular group, significantly greater than that particular estimated variance
(residual variance)?” The truth is that, when engaged in the analysis of variance,
we are interested not in variances of normal populations but in means of populations.
A straightforward treatment of an analysis of variance problem by the
method of likelihood leads to a criterion function which, when the null hypothesis
is true (treatments without real differences of effect, or no interaction, as the case
may be), is certainly the ratio of two estimated variances; but the question raised
is not “Is this ratio significantly different from 1?” but “Is this ratio significantly
greater than 1?”† We are led to Fisher’s z test or an equivalent, and not to the L
test. It is true that in the analysis of variance we have certain squariances $S_1$,
$S_2$, ... which, when the null hypothesis is true, are unscaled gamma variables
with equal scale parameters, and so the corresponding L will have the distribution
discussed above; but this would not justify the use of the value of L as a test for
the null hypothesis. In devising a satisfactory test we must consider the state of

† That being so, it may be remarked in passing, to speak of a “negative interaction” in a way
which implies that it has some significance is a breach of statistical good manners; it is questioning
the referee’s decision after having agreed to accept it as final.
affairs not only when the hypothesis to be tested is true but also when it is false.
In this case, when the null hypothesis is false, some at least of $S_1$, $S_2$, ... are not
unscaled gamma variables but have a different distribution, and so the proof that
the test is unbiased no longer holds. When used outside the field in which it
naturally arises the L test loses its theoretical satisfactoriness. If, for example,
the totals for the various treatments were identical, the L test applied to variance
due to treatments and residual variance would judge this significant, i.e. as
indicating that the treatments produce real differences in effect, while the z test
would rightly judge the result as non-significant. Improbable as such a result is,
it is more probable when the null hypothesis is true than when it is false. It would
seem that the L test must lose power by judging as significant results in which
the variance due to treatments is small compared with the residual variance, and
that the z test must be more powerful in this field than the L test, i.e. more likely
to reject the null hypothesis when it is false. Wishart (1938) proposes to use the L
test as a supplementary test in experiments of factorial design when the z test
has given a non-significant result for treatments as a whole, but it seems unlikely
that this combination of the tests will be as powerful as the z test used alone.


7. The Neyman-Pearson test


The Neyman-Pearson criterion function for testing the hypothesis that the k
normal samples have been drawn from populations with the same variance is
(in our notation)

$$\frac{\Pi\{[S/(n+1)]^{\frac12(n+1)}\}}{\{\Sigma S/(N+k)\}^{\frac12(N+k)}},$$

small values of this being significant. Remembering that $S_r$ is an unscaled
$\Gamma(\tfrac12 n_r)$ variable, we see that this is equivalent to using the criterion function

$$K' = \frac{\Pi\{x^{m+\frac12}\}}{(\Sigma x)^{M+\frac12 k}}$$

instead of

$$K = \frac{\Pi\{x^{m}\}}{(\Sigma x)^{M}}$$

for testing the hypothesis $H_1$ applied to the variables $x_1, x_2, \ldots, x_k$ of § 2.

Denoting by p' the probability that K' > k', we can easily show that when 


h - Cg 


Cj, _ c, 


dp' 

Scj 


over the region 


c/7{r(m)}. 


( 2 ; 2 ) M +*/2 = 


t Neyman & Pearson have denoted by L^. 


(4) 
From consideration of the derivative with respect to $z_1$ of the function on
the left-hand side of (4) it is obvious that a line parallel to the $z_1$ axis, which
cuts the boundary of the conical region (4), cuts it in two points and all points
between these lie within the region (4). Let $\zeta$, $\zeta'$ ($\zeta < \zeta'$) be the $z_1$ co-ordinates
of these points; they will be functions of $z_2, \ldots, z_k$. Consider

J z^e~^^n{z'^-~^dzi. 

Integrating by parts, we obtain 


- er^% 




n[z^) 

since when Zj = ^ or the equality in (4) is satisfied. Hence 






r-e-^*zi(2’z)^-«xnr 


i7{zi} 

“ ' Jt m 

Thus, when = Cj = . . . = c^. = c, we have 




f 

1 - 


ilf-l-p, 1 


hc^\dZy 


Zz 2«i 


Sc' ci7{r(m)}J il{z8} Tz ^2zir^‘" 

over the region (4). By the method of § 4 we obtain 


ap' xT(if) f 1-^yi, , 

dc^''2cn{r(m)]!nn{y^^y^'’- 

over the region R determined by 

where $\Sigma y = 1$ as before. A necessary condition for the K' test to be unbiased is

$$\frac{\partial p'}{\partial c_1} = 0$$

when $c_1 = c_2 = \ldots = c_k$,

and therefore for an unbiased test we must have


It is obvious from symmetry and from the fact that $\Sigma y = 1$ that (5) is true when

$$m_1 = m_2 = \ldots = m_k = m.$$
In fact the K' test is then equivalent to the K test since

$$K' = K^{(2m+1)/2m},$$

and the K' test is therefore in that case unbiased. If the m are not all equal,
suppose $m_1 > m_2$; then

$$a^{m_1+\frac12}\,b^{m_2+\frac12} \gtreqless a^{m_2+\frac12}\,b^{m_1+\frac12}$$

according as $a \gtreqless b$.

Hence if P is any point in R with its $y_1$ co-ordinate less than its $y_2$ co-ordinate, the
point P' obtained by interchanging the $y_1$ and $y_2$ co-ordinates of P will also lie
in R, but if P is on the boundary of R (or sufficiently near it and inside R), and has
its $y_1$ co-ordinate greater than its $y_2$ co-ordinate, the point P' will lie outside R.

•4 

Therefore dy^ . . . > 0, 

and so (5) cannot be true. Thus the K' test is biased when the m are not all equal. 
Considering the application to samples from normal populations, we see that the 
Neyman-Pearson test is biased unless the number in each sample is the same, in 
which case it is equivalent to the L test. 

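8. The distribution of K when $H_1$ is false

The probability that K is greater than or equal to $\kappa$ is

$$p = \frac{1}{\Pi\{c^{m}\,\Gamma(m)\}}\int \Pi\{x^{m-1}e^{-x/c}\}\,dx_1 \ldots dx_k$$

over the region where

$$\frac{\Pi\{x^{m}\}}{(\Sigma x)^{M}} \ge \kappa,$$

which is equal to

$$\frac{1}{\Pi\{\Gamma(m)\}}\int \Pi\{z^{m-1}e^{-z}\}\,dz_1 \ldots dz_k$$

over

$$\frac{\Pi\{(cz)^{m}\}}{(\Sigma cz)^{M}} \ge \kappa.$$

By the method of § 4 we obtain

$$p = \frac{\Gamma(M)}{\Pi\{\Gamma(m)\}}\int \Pi\{y^{m-1}\}\,dy_1 \ldots dy_{k-1}$$

over

$$\frac{\Pi\{(cy)^{m}\}}{(\Sigma cy)^{M}} \ge \kappa,$$

where $\Sigma y = 1$. When k = 2, this can be evaluated by means of the tables of the
incomplete beta function. It can be evaluated also in some simple cases of k = 3.
An approximation to the distribution of L in the general case is being investigated.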
Summary

Tests of certain hypotheses concerning location and scale parameters are 
developed. These tests are unbiased and are applicable to chance variables with 
any continuous distributions. An application to the comparison of variances of 
normal variables yields Bartlett’s test, which is thus shown to be unbiased. The 
Neyman-Pearson variances test is shown to be biased except when the samples 
are all of the same size. 
NOTE

[As one of the authors of the original test and of the conception of “bias” in connexion
with tests of statistical hypotheses, it is perhaps permissible for me to say that Prof. Neyman
and I had for some time realized that the test was slightly biased when applied to samples
of unequal size. The difference between this test and that put forward by M. S. Bartlett,
and discussed more fully by Prof. Pitman above, may be expressed by saying that the
Bartlett test weights the sums of squares with degrees of freedom where the $L_1$ test had
used sample sizes as weights. Intuitionally the former weighting seems the more appropriate
to take but, as in certain problems of estimation with which the reader will probably
be familiar, the application of the principle of maximum likelihood, a form of which Neyman
and I had used, introduces sample sizes rather than degrees of freedom. In 1936 B. L. Welch
(Statistical Research Memoirs, 1, 53) had suggested the use of a test function in the form of
the ratio of a weighted geometric mean to a weighted arithmetic mean of sums of squares,
the weights being adjustable at will; but it was Bartlett who definitely advocated the
weighting with degrees of freedom.

In a Ph.D. Thesis for the University of London, completed in 1937, only a part of which
has been published (U. S. Nair, 1939), Dr Nair showed that in a particular case with k = 3
and sample sizes of 6, 10 and 15 the test was certainly biased. During the past winter
he also sent me a mathematical proof showing that in the simplest case, with k = 2, the
Bartlett test was unbiased. Of two or three people investigating the problem, Prof. Pitman
has however been the first to reach a complete solution. His proof is of great generality;
not only does it cover the case of any number of samples or groups of data, but it is derived
from a very interesting general approach not limited to the particular problem of testing
for homogeneity of standard deviations among normally distributed variables. E. S. P.]
(i) On the Calculation of the Cumulants of the χ-distribution
By N. L. JOHNSON and B. L. WELCH


In a recent investigation it has been found necessary to calculate the cumulants of the
χ-distribution with considerable accuracy. This may be done, of course, by calculating the
moments of χ about zero and then correcting in the usual way. It was found, however,
that a substantial saving of labour may be effected by another method which will be indicated
below. This method is essentially the same as that employed by K. Pearson (1916) to calculate
the constants of the standard deviation distribution, being extended to give the fifth and
sixth cumulants.

The distribution of χ with f degrees of freedom is given by

$$p(\chi) = \frac{\chi^{f-1}e^{-\frac12\chi^{2}}}{2^{\frac12 f-1}\,\Gamma(\tfrac12 f)}.$$

The kth moment of χ about zero is

$$\mu_k' = \frac{2^{\frac12 k}\,\Gamma\{\tfrac12(f+k)\}}{\Gamma(\tfrac12 f)},$$

whence it is seen that the even moments are integers and the odd moments integral multiples
of $\mu_1'$. The kth moment about the mean may therefore be expressed as a polynomial of
the kth degree in $\mu_1'$ with integral coefficients. Hence we obtain the following formulae
for the first six cumulants:

$$\kappa_1 = \mu_1',$$
$$\kappa_2 = f - \kappa_1^2,$$
$$\kappa_3 = (-2f+1)\kappa_1 + 2\kappa_1^3,$$
$$\kappa_4 = (-2f^2+2f) + (8f-4)\kappa_1^2 - 6\kappa_1^4,$$
$$\kappa_5 = (16f^2-16f+3)\kappa_1 + (20-40f)\kappa_1^3 + 24\kappa_1^5,$$
$$\kappa_6 = (16f^3-24f^2+8f) + (-136f^2+136f-28)\kappa_1^2 + (240f-120)\kappa_1^4 - 120\kappa_1^6.$$

By substituting $(f - \kappa_2)$ for $\kappa_1^2$ we may now obtain formulae for $\kappa_3$, $\kappa_4$, $\kappa_5$ and $\kappa_6$ involving
only the first power of $\kappa_1$ and powers of $\kappa_2$ up to the third. Thus

$$\kappa_3 = \kappa_1(1-2\kappa_2),$$
$$\kappa_4 = -2f(1-2\kappa_2) + 4\kappa_2 - 6\kappa_2^2,$$
$$\kappa_5 = \kappa_1[4f(1-2\kappa_2) + 3 - 20\kappa_2 + 24\kappa_2^2],$$
$$\kappa_6 = -8f^2(1-2\kappa_2) - f(20 - 104\kappa_2 + 120\kappa_2^2) + (28\kappa_2 - 120\kappa_2^2 + 120\kappa_2^3).$$

The facts that the multiplier of the highest power of f is always a multiple of $(1-2\kappa_2)$ and that
$\kappa_2$ tends to ½ as f tends to infinity, suggest that some simplification may result by expressing
these cumulants in terms of powers of $(1-2\kappa_2)$ instead of powers of $\kappa_2$. Putting $(1-2\kappa_2) = \alpha$
we find

$$\kappa_3 = \kappa_1\alpha,$$
$$\kappa_4 = \tfrac12[1 - (4f-2)\alpha - 3\alpha^2],$$
$$\kappa_5 = \kappa_1[-1 + (4f-2)\alpha + 6\alpha^2]$$
$$\phantom{\kappa_5} = \kappa_1[-2\kappa_4 + 3\alpha^2],$$
$$\kappa_6 = (2f-1) - (8f^2-8f-1)\alpha - 15(2f-1)\alpha^2 - 15\alpha^3$$
$$\phantom{\kappa_6} = 2(2f-1)\kappa_4 + 3\alpha - 12(2f-1)\alpha^2 - 15\alpha^3.$$
The calculation of the higher cumulants by means of these formulae is rendered very easy
once $\kappa_1$ is known. For f small $\kappa_1$ may be calculated from the following formulae:

$$f \text{ even:}\qquad \kappa_1 = \sqrt{\frac{\pi}{2}}\;\frac{(f-1)(f-3)\ldots 3\cdot 1}{(f-2)(f-4)\ldots 2},$$

$$f \text{ odd:}\qquad \kappa_1 = \sqrt{\frac{2}{\pi}}\;\frac{(f-1)(f-3)\ldots 4\cdot 2}{(f-2)(f-4)\ldots 3\cdot 1}.$$

(To 15 decimal places $\sqrt{2/\pi}$ = 0.79788 45608 02865, and $\sqrt{\pi/2}$ = 1.25331 41373 15500.)

For f large Stirling’s expansion for the factorial may be used, giving

$$\log\kappa_1 = \tfrac12\log f - \frac{1}{4f} + \frac{1}{24f^3} - \frac{1}{20f^5} + \ldots,$$

$$\kappa_1 = \sqrt{f}\left\{1 - \frac{1}{4f} + \frac{1}{32f^2} + \frac{5}{128f^3} - \frac{21}{2048f^4} - \frac{399}{8192f^5} + \frac{869}{65536f^6} \ldots\right\}.$$

The above formulae allow the calculation of the cumulants very quickly with as much
accuracy as is ever likely to be required, but the following short tables, giving only a few
figures, will be useful for many practical purposes. Note that in the second table the
cumulants are multiplied by powers of f in order to increase the accuracy of harmonic
interpolation.
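
The scheme described above (obtain the first cumulant of χ, then the higher cumulants from the second cumulant and the quantity one minus twice the second cumulant) is easy to mechanise. The Python sketch below is an illustration only: the function name `chi_cumulants` is hypothetical, and the first cumulant is taken from the gamma-function form of the first moment, $\sqrt{2}\,\Gamma\{\tfrac12(f+1)\}/\Gamma(\tfrac12 f)$, rather than from the product formulae.

```python
import math

def chi_cumulants(f):
    """First six cumulants of chi with f degrees of freedom."""
    # kappa_1 = sqrt(2) * Gamma((f+1)/2) / Gamma(f/2), via log-gamma
    k1 = math.sqrt(2.0) * math.exp(
        math.lgamma(0.5 * (f + 1)) - math.lgamma(0.5 * f)
    )
    k2 = f - k1 * k1
    alpha = 1.0 - 2.0 * k2
    k3 = k1 * alpha
    k4 = -2.0 * f * alpha + 4.0 * k2 - 6.0 * k2 ** 2
    k5 = k1 * (-2.0 * k4 + 3.0 * alpha ** 2)
    k6 = (2.0 * (2 * f - 1) * k4 + 3.0 * alpha
          - 12.0 * (2 * f - 1) * alpha ** 2 - 15.0 * alpha ** 3)
    return [k1, k2, k3, k4, k5, k6]
```

For large f the higher cumulants are small differences of nearly equal quantities, so double-precision results lose relative accuracy as f grows; for the small values of f tabulated here this is not a concern.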


 f      κ1         κ2         κ3         κ4          κ5          κ6

 1   0.797885   0.363380   0.218014   0.114771   -0.004438   -0.152659
 2   1.253314   0.429204   0.177460   0.045149   -0.037791   -0.068652
 3   1.595769   0.453521   0.148340   0.022247   -0.029635   -0.029175
 4   1.879971   0.465708   0.128935   0.012860   -0.021825   -0.014156
 5   2.127692   0.472926   0.115210   0.008271   -0.016482   -0.007712
 6   2.349964   0.477669   0.104953   0.005730   -0.012867   -0.004596
 7   2.553231   0.481014   0.096954   0.004189   -0.010346   -0.002935
 8   2.741625   0.483494   0.090506   0.003190   -0.008526   -0.001978
f

24// 


P“ 

^2 



KiWff 


6 

4 


0'477669 

0-2.57082 

0-2()627 

-0-18910 

-0-9927 

8 

3 

(H)693ll 

0'483494 

0-26.5989 

0-20413 

-0-19291 

- 1-1)128 

12 

2 

0-97!)4()6 

0'489l7(i 

0-264427 

()-2((02.5 

-0- 1939(5 

-1-0114 

24 

1 

0'989640 

049468(5 

0-252418 

0-10464 

-0-19211 

-0-9859 

CO 

0 

l-OOOOUIJ 

0-600000 

0-260000 


-0-18760 

-0-9375 


REFERENCE 

Pearson, Karl (1916). Appendix to Papers by “Student” and R. A. Fisher, Editorial.
Biometrika, 10, 522-9.


(ii) Note on Discriminant Functions

By B. L. WELCH, Ph.D.


Recent years have seen a widespread development of the methods of multivariate analysis.
Of the problems which have been discussed one of the simplest is that of apportioning
individuals with several measured characters into one or other of two completely specified
population groups. Thus if $x_1, x_2, \ldots, x_q$ are the q characters measured upon each individual,
and $\Pi_1$ and $\Pi_2$ denote the two possible populations, we suppose known the probability
distributions $p_1 = p(x_1, x_2, \ldots, x_q \mid \Pi_1)$ and $p_2 = p(x_1, x_2, \ldots, x_q \mid \Pi_2)$. In particular it is
frequently assumed that $\Pi_1$ and $\Pi_2$ have the same set of variances and covariances and differ
only in the mean values of the characters. In this case R. A. Fisher (1936) has considered
the problem of choosing the best linear function X of $x_1, x_2, \ldots, x_q$ to form the basis of
classification of the individuals. The solution was obtained by maximizing the absolute
value of the ratio of $E\{X \mid \Pi_1\} - E\{X \mid \Pi_2\}$ to the standard deviation of X. This is certainly
the best discriminant function of any kind, whether linear or not, provided $p_1$ and $p_2$ are of the
multivariate normal form.

However, without making any assumptions of normality or equality of variances and
covariances, the problem of obtaining the best function to discriminate between two
completely specified populations may still be solved. The function is simply the ratio of the
two probability laws $p_1/p_2$. This is almost self-evident, but the following demonstrations
may be useful.

Suppose in the first instance that it is possible to assess a priori probabilities $\omega_1$ and $\omega_2$
that an individual will belong to $\Pi_1$ or $\Pi_2$ respectively. This was possible, for instance,
in the problem of sexing human mandibles, discussed by E. S. Martin (1936). Then, if measurements
$x_1, x_2, \ldots, x_q$ are made on the individual, the a posteriori probabilities that it will
belong to $\Pi_1$ or $\Pi_2$ will be $\omega_1 p_1/(\omega_1 p_1 + \omega_2 p_2)$ and $\omega_2 p_2/(\omega_1 p_1 + \omega_2 p_2)$, respectively. Now
if it is equally important that an individual really belonging to $\Pi_1$ should not be classified
as belonging to $\Pi_2$ and vice versa, then we should assign it to $\Pi_1$ provided the a posteriori
probability $\omega_1 p_1/(\omega_1 p_1 + \omega_2 p_2)$ is greater than ½. Any other assignation would increase
the overall chance of misclassifications. We should therefore classify into $\Pi_1$ when

$$\frac{p_1}{p_2} > \frac{\omega_2}{\omega_1}. \qquad (1)$$

Whatever the prior probabilities the discriminant function to be calculated from the
observed measurements is therefore the ratio of the probability laws; the criterion level to
which this function is to be referred does, however, depend on the prior probabilities.
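
A minimal Python sketch of rule (1) follows. Everything specific in it is an assumption made for illustration: the two populations are taken to be univariate normal with hypothetical means 0 and 2 and unit standard deviation, and the function names are not from the note. The comparison is written as a product to avoid dividing by a vanishing density.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a normal variable (used here as a stand-in for p_1, p_2)."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def classify(p1, p2, w1, w2):
    """Return 1 when p1/p2 > w2/w1 (rule (1)), otherwise 2.

    p1, p2: values of the two probability laws at the observed point
    w1, w2: a priori probabilities of the two populations"""
    return 1 if w1 * p1 > w2 * p2 else 2

def classify_point(x, w1, w2):
    # Hypothetical populations: N(0, 1) and N(2, 1).
    return classify(normal_pdf(x, 0.0, 1.0), normal_pdf(x, 2.0, 1.0), w1, w2)
```

With equal priors the dividing point is x = 1; raising the prior of the first population to 0.9 moves it to about x = 2.1, while the discriminant function itself (the ratio of the two laws) is unchanged.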

If it is not possible or even appropriate to assess prior probabilities then the determination
of a rule of classification must be made to depend on other conditions. For instance, it
may be required (a) that if an individual really belongs to $\Pi_1$ there should be the same chance
of it being misclassified as there would be if it belonged to $\Pi_2$ and (b) that this common
chance should be a minimum. If $(x_1, x_2, \ldots, x_q)$ be represented by a point in q-dimensional
space, any rule of classification involves the choice of a region R in this space such that an
individual falling into R will be classified into $\Pi_2$. The above conditions (a) and (b) may then
be written

$$\int_R p_1\,d\tau = 1 - \int_R p_2\,d\tau, \qquad (2)$$

and

$$\int_R p_1\,d\tau \ \text{to be a minimum.} \qquad (3)$$

The problem is therefore to minimize $\int_R p_1\,d\tau$ subject to the condition that $\int_R (p_1 + p_2)\,d\tau$
is equal to unity. A straightforward application of the calculus of variations shows that R
consists of points such that

$$p_2/p_1 > k, \qquad (4)$$

where k is chosen to satisfy (2). The discriminant function is the same as before although the
criterion level will in general be different.

Whatever the conditions which are imposed, provided they are formulated on a probability
basis, it would appear that we shall be led to the same discriminant function. To take
a further instance we may mention that in discussing the theory of testing statistical
hypotheses, J. Neyman & E. S. Pearson (1932) have considered the problem of testing the
hypothesis that a sample comes from a certain population when there is only one possible
alternative population. They imposed the conditions (a) that the chance of rejection when
the hypothesis is true should be a small specified probability ε, and (b) that the chance of
rejection when the alternative hypothesis is true should be a maximum. The solution
depended only on the ratio of the probability laws of the sample on the two hypotheses.
This ratio has been termed by Neyman & Pearson a “likelihood ratio” and sometimes
a “test criterion”. It seems to me that “discriminant function” is a better term for this ratio
and that “criterion” is best reserved for the particular value of the function which is considered
critical. This value will, as we have seen, depend on the manner in which the problem
of deciding between the two populations is formulated.

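When the populations $\Pi_1$ and $\Pi_2$ are normal multivariate, differing only in their mean
values, it is easy to see that the ratio $p_1/p_2$ leads to the same linear function of $x_1, x_2, \ldots, x_q$
as is obtained by maximizing $\{E(X \mid \Pi_1) - E(X \mid \Pi_2)\}^2/\sigma_X^2$. For if

$$p(x_1, x_2, \ldots, x_q \mid \Pi_t) = \frac{1}{(2\pi)^{\frac12 q}\sqrt{W}}\exp\Big\{-\tfrac12\sum_{i,j} W^{ij}(x_i - \xi_{it})(x_j - \xi_{jt})\Big\} \quad (t = 1, 2), \qquad (5)$$

where W is the determinant of the covariances and $W^{ij}$ are the elements of the reciprocal
matrix, then

$$\log\frac{p_1}{p_2} = -\tfrac12\sum_{i,j} W^{ij}\{(x_i - \xi_{i1})(x_j - \xi_{j1}) - (x_i - \xi_{i2})(x_j - \xi_{j2})\}$$

$$\phantom{\log\frac{p_1}{p_2}} = \sum_{i,j} W^{ij} d_i x_j + \text{const.}, \qquad d_i = \xi_{i1} - \xi_{i2},$$

and $X = \sum_{i,j} W^{ij} d_i x_j$ is the same linear function as is obtained from the other approach (R. A.
Fisher, 1936).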
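
The reduction of the log ratio to a linear function can be verified numerically. The following Python sketch treats the two-character case only; the covariance matrix, the means and the function names are all hypothetical choices made for the illustration.

```python
def inverse_2x2(cov):
    """Reciprocal of a 2x2 covariance matrix."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

def discriminant_coefficients(cov, mean1, mean2):
    """Coefficient of x_j in the linear function: sum_i W^{ij} d_i."""
    w = inverse_2x2(cov)
    d = [mean1[0] - mean2[0], mean1[1] - mean2[1]]
    return [w[0][j] * d[0] + w[1][j] * d[1] for j in range(2)]

def log_ratio(cov, mean1, mean2, x):
    """log(p1/p2) computed directly from the two quadratic forms."""
    w = inverse_2x2(cov)

    def quad(mean):
        v = [x[0] - mean[0], x[1] - mean[1]]
        return sum(w[i][j] * v[i] * v[j] for i in range(2) for j in range(2))

    return -0.5 * (quad(mean1) - quad(mean2))
```

For the covariance matrix [[2, 1], [1, 1]] with means (1, 0) and (0, 0), the coefficients are (1, −1) and the log ratio at any point equals $x_1 - x_2$ minus one half, so the ratio of the two laws orders the individuals exactly as the linear function does.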





REFERENCES 

Fisher, R. A. (1936). “The use of multiple measurements in taxonomic problems.”
Ann. Eugen., Lond., 7, 179-88.

Martin, E. S. (1936). “A study of an Egyptian series of mandibles, with special reference
to mathematical methods of sexing.” Biometrika, 28, 149-72.

Neyman, J. & Pearson, E. S. (1932). “On the problem of the most efficient tests of
statistical hypotheses.” Philos. Trans. A, 231, 289-337.


(iii) Principles of Genetics. By E. W. Sinnott and L. C. Dunn. Fourth Edition.
London: McGraw-Hill Publishing Company, 1939. Price 21s.

This book, which has gone through three editions since 1925, needs no recommendation
to geneticists. It is so good on the whole that a short review can be mainly concerned with
its defects. Mendel did not use red-flowered peas as stated on p. 41, but purple-flowered
(rot-violett) and, if he had crossed red and white, would probably have got a purple $F_1$. The
treatment of biometrical methods on pp. 138-61 is unsatisfactory. For example, the
account of the significance of a difference of means takes no account of “Student’s” work,
and is therefore just 31 years out-of-date. There is no need to point out the serious nature
of the errors which may arise from this neglect. And since Bateson’s original definition of
genetics included a study of populations, I cannot but feel that the authors have devoted
too little space to this topic. Nevertheless, as an elementary account of modern genetics,
the book can be strongly recommended.

J. B. S. H.


(iv) Corrections to formulae in papers on the moments of By J. B. S. 
Haldane. 

The following errata should be noted: 

Biometrika, 29 , 138, equations (3): 

For Xj = 8 -I- 2(1 lit - 6) (i:^ - 30& -f 120) 8-\ 

read /tj=X3=8-|-2(Uh~66)s~i-l-(^“-301:-|-120)s-*. 

Biometrika, 29 , 390, antepenultimate line: 

For X, = 3840 -i- 21,300 to-i + 249, 600to-H 69,1 60wi-» -p 2004m-'' + m~\ 
read x, = 3840 -t- 124,800m-i -f 249,600 ot-* + 69, IGOm-s + 2004m-'‘ + m~K 

Biometrika, 29 , 391, equations (1): 

Fori x,= 3840n -f 21,300Ri-p 249, 60012* -p m,U0B^ + 2004114 +.B^, 
read x, = 3840n-P 124, SOORi-P 249, eOORj-P 69, ICOUa-P 2004114 + 115 . 

And line 20: 

For +5814-04nHl8,640-492n. 
read +6814'04nH39,340-492n. 


J. B. S. H.
ON GENERALIZED ANALYSIS OF VARIANCE. (I)

By P. L. HSU

1. The Wilks-Lawley hypothesis. Suppose that, as a result of some random
sampling, we are in possession of $p(n_1 + n)$ quantities, where $n \ge p$, calculated
from the observational data. Calling these $y_{ir}$ and $z_{ir'}$ ($i = 1, 2, \ldots, p$; $r = 1, 2, \ldots, n_1$;
$r' = 1, 2, \ldots, n$), we assume that in repeated sampling they follow the probability
distribution

$$\text{Const.}\times\exp\Big\{-\tfrac12\sum_{i,j=1}^{p}\alpha_{ij}\Big[\sum_{r=1}^{n_1}(y_{ir}-\eta_{ir})(y_{jr}-\eta_{jr}) + \sum_{r=1}^{n} z_{ir}z_{jr}\Big]\Big\}\,\Pi\,dy\,dz, \qquad (1)$$

and that we have no previous knowledge of the values of the $\alpha_{ij}$. Our problem is to
test the hypothesis $H_0$:

$$\eta_{ir} = 0 \quad\text{for } i = 1, 2, \ldots, p;\ r = 1, 2, \ldots, n_1. \qquad (H_0)$$

For p = 1 the hypothesis is that to which every linear hypothesis* may be
reduced, and the test amounts to that ordinarily employed in the analysis of
variance. For $n_1 = 1$ the problem calls for Hotelling’s generalized “Student’s”
test.†

It should be emphasized that the $y_{ir}$, $z_{ir}$ and $\eta_{ir}$ will not usually correspond to
the original physical observations and physical constants belonging to the sampled
populations, but are so derived from them that (1) is true and that $H_0$ is equivalent
to some hypothesis regarding population constants that we want to test. In another
paper we shall deal with a general class of “linear hypothesis” on multivariate
normal means, showing how all of them can be reduced by a rotation of the sample
space to the “canonical” $H_0$. Here we content ourselves with the following example.

Example. Case of k samples. $x_{i\nu t}$ ($i = 1, 2, \ldots, p$; $\nu = 1, 2, \ldots, k$; $t = 1, 2, \ldots, m_\nu$)
are k samples drawn respectively from p-variate normal populations all of which
have the same set of variances and covariances. Let the population means be
$\xi_{i\nu}$ ($i = 1, 2, \ldots, p$; $\nu = 1, 2, \ldots, k$). Let

Ic _ 1 fc 

X W, = -¥, = -jjy X 

1 _ 1 _ 

Xu = “ ^ M ^ 

% (=1 M ,,=1 

~ S (Xui ~~ ^fi/) {Xjyi ~Xjy), 

{i,j=l,2,...,pi v=l,2,...,k). 

* Kołodziejczyk (1935), Tang (1938).
† Hotelling (1931), Hsu (1938), Bose and Roy (1938).
Suppose that we want to test the hypothesis:

$$\xi_{i1} = \xi_{i2} = \ldots = \xi_{ik} \quad\text{for } i = 1, 2, \ldots, p.$$

Then it is known that if we call $y_{ir}$ and $z_{ir}$ certain linear functions of the $x_{i\nu t}$, and
$\eta_{ir}$ certain linear functions of the $\xi_{i\nu}$, (1) will hold true and the hypothesis under test
is equivalent to $H_0$. Thus reducing the problem, we have

$$n_1 = k - 1, \qquad n = M - k,$$

n\ ^ n k 

S yirVjr “ S S 2 ^ijvi 

7’ssi p=l )>==1 

ni fc „ , 

r*l y=l 




In particular, if, Ic = 2 then Ui - I, and we may drop the second index of y 
and 1 ?. We have then 

ViVi = - * 12 ) (»ii - ^ia). 

n wi] wi, 

l:v^r= 2 (%(-%) S (»i2<~*<2)(%(-»J2). 

faal isai isal 


The hypothesis in the above example has been studied by Wilks (1932), while
an attempt to test the general hypothesis $H_0$ was made by Lawley (1938). Both
assumed no prior knowledge of the values of the $\alpha_{ij}$. We propose to call $H_0$ the
Wilks-Lawley hypothesis.


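2. Case where the $\alpha_{ij}$ are known; generalizations of Mahalanobis's distance.
From now on we shall write

$$a_{ij}=\sum_{r=1}^{n_1}y_{ir}y_{jr},\qquad b_{ij}=\sum_{r=1}^{n}z_{ir}z_{jr},$$

$$\psi_{ij}=\sum_{r=1}^{n_1}\eta_{ir}\eta_{jr},\qquad \Psi=\sum_{i,j=1}^{p}\alpha_{ij}\psi_{ij},$$

$(i,j=1,2,\dots,p)$.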

In order to test $H_0$ when the $\alpha_{ij}$ are known, we calculate the likelihood ratio*
and get

$$-2\log(\text{likelihood ratio})=\sum_{i,j=1}^{p}\alpha_{ij}a_{ij}=S,\ \text{say},$$

and decide to reject $H_0$ if the value of S exceeds some fixed constant, chosen so as
to fix at some desired level the risk of rejecting $H_0$ when it is true.

* For the definition and interpretation of the term see Neyman & Pearson (1928).
Theorem 1. For arbitrary values of the $\eta_{ir}$ the distribution of S is that of the sum
of $pn_1$ non-central squares:*

$$2^{-\frac12pn_1}S^{\frac12pn_1-1}e^{-\frac12(S+\Psi)}\sum_{h=0}^{\infty}\frac{(S\Psi)^h}{4^h\,h!\,\Gamma(h+\tfrac12pn_1)}\,dS. \qquad (2)$$

In particular, if $H_0$ is true, then $\Psi = 0$ and S follows the $\chi^2$ distribution with $pn_1$
degrees of freedom.

Proof. Let the sets of variables $(y_{1r}, y_{2r}, \dots, y_{pr})$, for $r = 1, 2, \dots, n_1$, be subject to
the same linear transformation such that, calling the new variables $(u_{1r}, \dots, u_{pr})$,

$$\sum_{i,j=1}^{p}\alpha_{ij}y_{ir}y_{jr}=\sum_{i=1}^{p}u_{ir}^2\qquad (r=1,2,\dots,n_1).$$

This is possible because the matrix $\|\alpha_{ij}\|$ is positive definite. Let $\mu_{ir}$ be the same
linear function of the $\eta$'s as $u_{ir}$ is of the y's. Then

$$S=\sum_{i=1}^{p}\sum_{r=1}^{n_1}u_{ir}^2 \qquad (3)$$

and the $u_{ir}$ follow the distribution

$$(2\pi)^{-\frac12pn_1}\exp\Big\{-\tfrac12\sum_{i=1}^{p}\sum_{r=1}^{n_1}(u_{ir}-\mu_{ir})^2\Big\}\prod du. \qquad (4)$$

From (3) and (4) follows the result (2),† as we have

$$\sum_{i=1}^{p}\sum_{r=1}^{n_1}\mu_{ir}^2=\sum_{i,j=1}^{p}\alpha_{ij}\sum_{r=1}^{n_1}\eta_{ir}\eta_{jr}=\Psi.$$

From (2) we get

$$\mathcal{E}(S)=\Psi+pn_1, \qquad (5)$$

$$\mathcal{E}(S^2)=\Psi^2+2(pn_1+2)\Psi+pn_1(pn_1+2),$$

whence $\sigma^2(S)=4\Psi+2pn_1$.
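The step from the first two moments to the variance can be checked numerically. The following is a minimal sketch, not part of the original paper; the function name `noncentral_chi2_moments` and the sample values of p, n1 and psi are illustrative assumptions.

```python
def noncentral_chi2_moments(psi, df):
    """Mean, second raw moment and variance of S as given in the text.

    psi : non-centrality parameter (Psi in the text)
    df  : degrees of freedom (p * n1 in the text)
    """
    mean = psi + df
    second = psi ** 2 + 2 * (df + 2) * psi + df * (df + 2)
    variance = second - mean ** 2  # should reduce to 4*psi + 2*df
    return mean, second, variance


# Illustrative values (assumed, not taken from the paper).
p, n1, psi = 3, 4, 2.5
mean, second, variance = noncentral_chi2_moments(psi, p * n1)
print(mean, second, variance)
```

For any psi and df the third returned value agrees with 4*psi + 2*df, which is the variance stated above.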

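If these general results are applied to the above example of k samples, we have

$$S=\sum_{i,j=1}^{p}\alpha_{ij}\sum_{\nu=1}^{k}m_\nu(\bar x_{i\nu}-\bar x_i)(\bar x_{j\nu}-\bar x_j),$$

$$\Psi=\sum_{i,j=1}^{p}\alpha_{ij}\sum_{\nu=1}^{k}m_\nu(\xi_{i\nu}-\bar\xi_i)(\xi_{j\nu}-\bar\xi_j)=\Psi_k,$$

say, and the distribution

$$2^{-\frac12p(k-1)}S^{\frac12p(k-1)-1}e^{-\frac12(S+\Psi_k)}\sum_{h=0}^{\infty}\frac{(S\Psi_k)^h}{4^h\,h!\,\Gamma(h+\tfrac12p(k-1))}\,dS.$$

For k = 2 we have

$$S=\frac{m_1m_2}{m_1+m_2}\sum_{i,j=1}^{p}\alpha_{ij}(\bar x_{i1}-\bar x_{i2})(\bar x_{j1}-\bar x_{j2}),$$

$$\Psi_2=\frac{m_1m_2}{m_1+m_2}\sum_{i,j=1}^{p}\alpha_{ij}(\xi_{i1}-\xi_{i2})(\xi_{j1}-\xi_{j2}),$$

* Fisher (1928), p. 669.
† Fisher (1928), Tang (1938), pp. 138-9.

and the distribution

$$2^{-\frac12p}S^{\frac12p-1}e^{-\frac12(S+\Psi_2)}\sum_{h=0}^{\infty}\frac{(S\Psi_2)^h}{4^h\,h!\,\Gamma(h+\tfrac12p)}\,dS. \qquad (7)$$

The quantity $\Psi_2$ is equal to

$$\frac{p\,m_1m_2}{m_1+m_2}\,\Delta^2,$$

where $\Delta$ is Mahalanobis's "distance"* between two populations having the same
variances and covariances, weighted according to $m_1$ and $m_2$. The statistic

$$D^2=\frac{m_1+m_2}{p\,m_1m_2}\,S-\frac{1}{m_1}-\frac{1}{m_2},$$

which, according to (5), is an unbiassed estimate of $\Delta^2$, is called the $D^2$ statistic, and
the distribution (7) has already been obtained.† These concepts and formulae are
completely generalized here by $\Psi_k$ and the formulae (5) and (6) for the case
k > 2.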


3. Tests suggested by Wilks and Lawley. Generalized $E^2$, $1-E^2$ and $E^2(1-E^2)^{-1}$.
From now on we shall assume complete ignorance of the values of the $\alpha_{ij}$. In the
case p = 1 the following functions are well known:

$$E^2=\sum_{r=1}^{n_1}y_r^2\Big/\Big(\sum_{r=1}^{n_1}y_r^2+\sum_{r=1}^{n}z_r^2\Big),$$

$$1-E^2=\sum_{r=1}^{n}z_r^2\Big/\Big(\sum_{r=1}^{n_1}y_r^2+\sum_{r=1}^{n}z_r^2\Big),$$

$$E^2(1-E^2)^{-1}=\sum_{r=1}^{n_1}y_r^2\Big/\sum_{r=1}^{n}z_r^2.$$

Wilks, while studying the k-sample case,‡ suggested the functions U and W
respectively as generalizations of $E^2$ and $1-E^2$. In our general case these are
defined as

$$U=\frac{|a_{ij}|}{|a_{ij}+b_{ij}|}\quad\text{if } n_1\ge p, \qquad (8)$$

and

$$W=\frac{|b_{ij}|}{|a_{ij}+b_{ij}|}. \qquad (9)$$

Both are ratios of two determinants. The above definition for U, based on the idea
of a generalized $E^2$, breaks down if $n_1 < p$, as then $|a_{ij}|$ vanishes identically. This
difficulty can be met if we define U as the product of the non-identically vanishing
roots of the determinantal equation $|a_{ij}-\theta(a_{ij}+b_{ij})|=0$. If $n_1\ge p$ this definition
evidently coincides with (8).

* Mahalanobis (1936).
† Bose (1936).
‡ Wilks (1932).



P. L. Hsu 


As a generalization of $E^2(1-E^2)^{-1}$ we take, instead of $UW^{-1}$, the function V
suggested by Lawley* and defined as

$$V=\sum_{i,j=1}^{p}a_{ij}b^{ij}, \qquad (10)$$

where $b^{ij}$ is the element (i, j) of the reciprocal matrix $\|b_{ij}\|^{-1}$.

If we denote by $\theta_1, \theta_2, \dots, \theta_{l_1}$, where $l_1$ is the smaller of the integers p and $n_1$,
the non-identically vanishing roots of the equation

$$|a_{ij}-\theta(a_{ij}+b_{ij})|=0,$$

then, according to (9) and (10) and the new definition of U, we have

$$U=\prod_{i=1}^{l_1}\theta_i, \qquad (11)$$

$$W=\prod_{i=1}^{l_1}(1-\theta_i), \qquad (12)$$

$$V=\sum_{i=1}^{l_1}\frac{\theta_i}{1-\theta_i}. \qquad (13)$$

W and V are test functions for $H_0$ suggested respectively by Wilks and Lawley.
If we use W (or V), we reject $H_0$ if W is smaller (or V is greater) than some
prescribed constant. In the following sections we shall study the distributions of
U, V and W in repeated sampling.
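The relations between the roots and the three statistics can be illustrated for p = 2, where the determinantal equation is a quadratic. This is only a sketch under stated assumptions: the helper names (`det2`, `roots_2x2`) and the two example matrices are hypothetical, and the case shown assumes $n_1 \ge p$, so that U may also be computed as a ratio of determinants.

```python
import math


def det2(m):
    """Determinant of a 2 x 2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def roots_2x2(a, b):
    """Roots theta of |a - theta (a + b)| = 0 for symmetric 2 x 2 a, b."""
    c = [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]
    # det(a - theta c) = det(a) - theta * mixed + theta^2 * det(c)
    mixed = (a[0][0] * c[1][1] + a[1][1] * c[0][0]
             - a[0][1] * c[1][0] - a[1][0] * c[0][1])
    qa, qb, qc = det2(c), -mixed, det2(a)
    disc = math.sqrt(max(qb * qb - 4 * qa * qc, 0.0))
    return (-qb + disc) / (2 * qa), (-qb - disc) / (2 * qa)


# Hypothetical sums of squares and products (a: hypothesis, b: error).
a = [[4.0, 1.0], [1.0, 3.0]]
b = [[10.0, 2.0], [2.0, 8.0]]
c = [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

theta1, theta2 = roots_2x2(a, b)
U = theta1 * theta2                                   # product of the roots
W = (1 - theta1) * (1 - theta2)                       # product of (1 - root)
V = theta1 / (1 - theta1) + theta2 / (1 - theta2)     # sum of root / (1 - root)

# Direct evaluation from the determinant and trace definitions.
W_direct = det2(b) / det2(c)
U_direct = det2(a) / det2(c)
b_inv = [[b[1][1] / det2(b), -b[0][1] / det2(b)],
         [-b[1][0] / det2(b), b[0][0] / det2(b)]]
V_direct = sum(a[i][j] * b_inv[i][j] for i in range(2) for j in range(2))
print(U, W, V)
```

The root-based and direct values agree to rounding error, which is the content of the three product and sum formulas for this small case.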

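4. Case when $H_0$ is true. If $H_0$ is true, then, putting all the $\eta_{ir}=0$ in (1), we get
the parent distribution

$$\text{const.}\times\exp\Big\{-\tfrac12\sum_{i,j=1}^{p}\alpha_{ij}(a_{ij}+b_{ij})\Big\}\prod dy\,dz. \qquad (14)$$

The following theorem has been proved elsewhere.†

Theorem 2. Let $l_1$ be the smaller, and $l_2$ the larger, of the integers p and $n_1$. Let
$\theta_1, \theta_2, \dots, \theta_{l_1}$ be the non-identically vanishing roots of the determinantal equation

$$|a_{ij}-\theta(a_{ij}+b_{ij})|=0, \qquad (15)$$

arranged in the order of descending magnitude: $1\ge\theta_1\ge\theta_2\ge\dots\ge\theta_{l_1}\ge0$. Then the
simultaneous distribution of the $\theta$'s, as derived from (14), is

$$\pi^{\frac12l_1}\prod_{i=1}^{l_1}\frac{\Gamma(\tfrac12(n-p+l_2+i))}{\Gamma(\tfrac12(l_2-l_1+i))\,\Gamma(\tfrac12(n-p+i))\,\Gamma(\tfrac12i)}\ \prod_{i=1}^{l_1}\theta_i^{\frac12(l_2-l_1-1)}\prod_{i=1}^{l_1}(1-\theta_i)^{\frac12(n-p-1)}\prod_{i=1}^{l_1-1}\prod_{j=i+1}^{l_1}(\theta_i-\theta_j)\prod_{i=1}^{l_1}d\theta_i. \qquad (16)$$

Therefore the distributions of U, W and V are those of the functions in (11), (12)
and (13), where the $\theta$'s follow the distribution (16).

* Lawley (1938).
† Hsu (1939), Fisher (1939).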
Take the particular case $l_1 = 2$. We have then

whence + F= W), 

and the distribution (16) becomes 

(17) 

From (17) we easily obtain the joint distribution of U and W : 


4r(i^-i)r(n-p+i) 

Integrating with respect to W ranging from 0 to $(1-U^{1/2})^2$, we get the distribution
of U:

Similarly, the distribution of W is

1 


2B(li,n-p + l) 

Again, from (18) we find the distribution of V : 
T[n-p\\4-\) 


lFi(«-J>-i)(i-W‘)h“idW, 


4r(k-i)r{n-p+i] 


(1+ F)-Kn-P+3) 


fP 

iJ i 


t{i+n 

(s+ro* 




(19) 


dV. (20) 


Now in the particular case considered, where $l_1 = 2$, we have either (i) p = 2,
$l_2 = n_1$ if $p\le n_1$, or (ii) $n_1 = 2$, $l_2 = p$ if $n_1\le p$. Hence, from (18), (19) and (20) we
get the following three pairs of distributions:

^ 17«'^r3)(l-t/l)«-i(ii7, 2=p<fti, 


2H(wi-l,w) 

1 


2B{p-],n~p + 2,) 

1 


mv-i)(i--ui)n-p+idu, 2=ni^p, 


25(mj,m-l) 

1 

2B{p,n~p+l) 
r(%+?i-l) 
4r(ni-i)r(?i-i) 

r{n+l) 


ir('p-\)r(n-p+i) 


|fi(«-3)(i_W‘)«>-idW, 2=j)<ni, 

IF!(n-j)-r)(i_p)ji-i(i|f, 2=n^^p, 

(1 + F)~**”+^*(iF [ 2/*(«-W(l-2/)i(”i-®)ti?/, 2—p^ni, 

Jiii+Tw-r)-' 

(l^.y)-Kn-P+3)^F j*^ yKn-P+i)(l-y)UP-3)dy, 2 = ni^p. 


4U+F')12+Vr 





The general expression for the moments of U and W can easily be deduced
from (16). We have, from (16),

G{k,n)\f{d-k,n)Uddi=^l, 

J 1=1 


where 


Hence 


U=i 




I h U 

n n 

i=;l jssi^l 


•* 


H d6^ — ^ -, 

i^i op2,nj 



whence lMkni,n + q,)ndB, = 

= n GU‘>^-p + k+i)r{{l^-ly^ + 2q^ + i)r^{n+2q^-p+i) 

~ i= 1 mn-p + k + 2?! + 2?2 + i) r\{l^ -l^ + i)r\{n-p+iy 
The case of k samples (see p. 221) of two variables [p = 2) has been studied at 
length by Pearson & Wilks (1933). The hypothesis and the functions V and 
W are called in their paper Hj, aiid JDj respectively. It is found there that the 
test based onH^ (^•®- cannot be regarded as an adequate test of We shall 
show in the next section that $H_0$ is true if and only if two population constants,
called $\lambda_1$ and $\lambda_2$, both vanish. The disadvantage of U referred to by Pearson and
Wilks may be expressed by saying that it is unlikely to be able to detect the
falsehood of $H_0$ if only one of $\lambda_1$ and $\lambda_2$ vanishes.

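5. Simplification of the parent distribution function. The matrix $\|\alpha_{ij}\|$ is
positive definite; hence it can be expressed as CC',* where C is some non-singular
real matrix. Write

$$Y=\begin{pmatrix}y_{11}&y_{12}&\cdots&y_{1n_1}\\ y_{21}&y_{22}&\cdots&y_{2n_1}\\ \vdots&\vdots&&\vdots\\ y_{p1}&y_{p2}&\cdots&y_{pn_1}\end{pmatrix},\quad
Z=\begin{pmatrix}z_{11}&z_{12}&\cdots&z_{1n}\\ z_{21}&z_{22}&\cdots&z_{2n}\\ \vdots&\vdots&&\vdots\\ z_{p1}&z_{p2}&\cdots&z_{pn}\end{pmatrix},\quad
G=\begin{pmatrix}\eta_{11}&\eta_{12}&\cdots&\eta_{1n_1}\\ \eta_{21}&\eta_{22}&\cdots&\eta_{2n_1}\\ \vdots&\vdots&&\vdots\\ \eta_{p1}&\eta_{p2}&\cdots&\eta_{pn_1}\end{pmatrix},$$

$$\|\alpha^{ij}\|=\|\alpha_{ij}\|^{-1}=(C')^{-1}C^{-1}.$$

Theorem 3. Let G be of rank l (hence $l\le p$, $l\le n_1$ and l is the rank of $GG'=\|\psi_{ij}\|$),
and let $\lambda_1, \lambda_2, \dots, \lambda_l$ be the non-vanishing roots of the determinantal equation
$|\psi_{ij}-\lambda\alpha^{ij}|=0$. Let $\theta_1, \theta_2, \dots, \theta_{l_1}$ be the non-identically vanishing roots of the
determinantal equation $|a_{ij}-\theta(a_{ij}+b_{ij})|=0$. Then the joint distribution of the $\theta$'s,
as derived from (1), depends upon the parameters $\lambda_1, \lambda_2, \dots, \lambda_l$ alone in such a way that
instead of (1) we may regard the parent distribution as

$$(2\pi)^{-\frac12p(n_1+n)}\exp(-\tfrac12\Psi)\exp\Big\{-\tfrac12\sum_{i=1}^{p}\sum_{r=1}^{n_1}y_{ir}^2-\tfrac12\sum_{i=1}^{p}\sum_{r=1}^{n}z_{ir}^2+\sum_{i=1}^{l}\sqrt{\lambda_i}\,y_{ii}\Big\}\prod dy\,dz. \qquad (21)$$

* The accent denotes the transposed matrix.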
Hence, in studying the distribution of any functions of the $\theta$'s, such as U, V
and W, we may replace (1) by (21).

Lemma 1 (Hotelling).* Let A and B be any positive definite matrices of orders
m and n respectively, and C be any real matrix of order m × n and rank l. There exist
two non-singular real matrices, $N_1$ and $N_2$, such that

$$N_1AN_1'=I,†\qquad N_2BN_2'=I,\qquad N_1CN_2'=D,$$

where

$$D=\begin{pmatrix}\sqrt{\lambda_1}&&&\\ &\ddots&&O\\ &&\sqrt{\lambda_l}&\\ &O&&O\end{pmatrix}$$

and the $\lambda_i$ are the non-vanishing roots of the determinantal equation

$$|CB^{-1}C'-\lambda A|=0.$$

Allowing both A and B to be unit matrices in Lemma 1 we get the following:

Corollary. Let C be any real matrix of rank l. There exist two real orthogonal
matrices $\Gamma_1$ and $\Gamma_2$ such that

$$\Gamma_1C\Gamma_2'=D,$$

where D has the same form as in Lemma 1, with diagonal elements $\sqrt{\lambda_1}, \dots, \sqrt{\lambda_l}$,
and the $\lambda_i$ are the non-vanishing latent roots of the matrix CC'.

Proof of Theorem 3. We write tr A for the sum of the diagonal elements of any
square matrix A. It is easily verified that tr (AB) = tr (BA) whenever AB is a
square matrix.

The expression inside the bracket in (1) may be written as

$$-\tfrac12\Psi-\tfrac12\,\mathrm{tr}(CC'YY')-\tfrac12\,\mathrm{tr}(CC'ZZ')+\mathrm{tr}(CC'GY')$$
$$=-\tfrac12\Psi-\tfrac12\,\mathrm{tr}(C'YY'C)-\tfrac12\,\mathrm{tr}(C'ZZ'C)+\mathrm{tr}(C'GY'C). \qquad (22)$$

The $\lambda_i$, as defined in Theorem 3, are the non-vanishing roots of the equation
$|GG'-\lambda(C')^{-1}C^{-1}|=0$. On pre- and post-multiplying by $|C'|$ and $|C|$ respectively
this becomes $|C'GG'C-\lambda I|=0$. Hence the $\lambda_i$ are the non-vanishing
latent roots of the matrix (C'G)(C'G)'. Therefore, by the corollary to Lemma 1,
there exist two real orthogonal matrices, $\Gamma_1$ and $\Gamma_2$, such that

$$\Gamma_1C'G\Gamma_2'=D=\begin{pmatrix}\sqrt{\lambda_1}&&&\\ &\ddots&&O\\ &&\sqrt{\lambda_l}&\\ &O&&O\end{pmatrix}. \qquad (23)$$

* Hotelling (1936), pp. 326-30.
† I and O stand for a unit matrix and a zero matrix respectively.
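The transformation of variables

$$Y=(C')^{-1}\Gamma_1'U\Gamma_2,\qquad Z=(C')^{-1}\Gamma_1'V, \qquad (24)$$

where

$$U=\begin{pmatrix}u_{11}&u_{12}&\cdots&u_{1n_1}\\ u_{21}&u_{22}&\cdots&u_{2n_1}\\ \vdots&\vdots&&\vdots\\ u_{p1}&u_{p2}&\cdots&u_{pn_1}\end{pmatrix},\qquad
V=\begin{pmatrix}v_{11}&v_{12}&\cdots&v_{1n}\\ v_{21}&v_{22}&\cdots&v_{2n}\\ \vdots&\vdots&&\vdots\\ v_{p1}&v_{p2}&\cdots&v_{pn}\end{pmatrix}$$

are the matrices of the new variables, has a constant Jacobian and leaves the $\theta$'s
invariant, because the equation (15) becomes

$$|\bar a_{ij}-\theta(\bar a_{ij}+\bar b_{ij})|=0, \qquad (15')$$

where

$$\bar a_{ij}=\sum_{r=1}^{n_1}u_{ir}u_{jr},\qquad \bar b_{ij}=\sum_{r=1}^{n}v_{ir}v_{jr}\qquad (i,j=1,2,\dots,p).$$

To show this, we have, remembering the orthogonality of $\Gamma_2$,

$$\|a_{ij}\|=YY'=(C')^{-1}\Gamma_1'UU'\Gamma_1C^{-1},$$
$$\|b_{ij}\|=ZZ'=(C')^{-1}\Gamma_1'VV'\Gamma_1C^{-1}.$$

Hence the equation (15) is transformed into

$$|UU'-\theta(UU'+VV')|=0,$$

which is another way of writing (15').

On the other hand, substituting (24) into (22) and remembering (23) we get
the expression

$$-\tfrac12\Psi-\tfrac12\,\mathrm{tr}(UU')-\tfrac12\,\mathrm{tr}(VV')+\mathrm{tr}(DU')
=-\tfrac12\Psi-\tfrac12\sum_{i=1}^{p}\sum_{r=1}^{n_1}u_{ir}^2-\tfrac12\sum_{i=1}^{p}\sum_{r=1}^{n}v_{ir}^2+\sum_{i=1}^{l}\sqrt{\lambda_i}\,u_{ii}.$$

Replacing again the letters u and v by y and z respectively, we obtain the result.
It may be noticed that

$$\Psi=\mathrm{tr}(CC'GG')=\mathrm{tr}(C'GG'C)=\sum_{i=1}^{l}\lambda_i. \qquad (25)$$

Remark. For the case of k samples (see Example on p. 221) Fisher (1938) has
considered the problem of testing for the colinearity or coplanarity of the k
populations. It can be proved that the hypothesis of colinearity (or coplanarity)
states that all except one (or two) of the $\lambda$'s vanish.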

6. Behaviour of V when $H_0$ is not necessarily true. The Laplace transform of a
probability density function p(x) vanishing identically for x < 0, viz. the integral

$$\int_0^\infty e^{-\alpha x}p(x)\,dx,$$

has the following property.

Lemma 2. Let $p_n(x)$ (n = 1, 2, 3, ...) and p(x) be probability density functions
vanishing identically for x < 0, and such that

$$\lim_{n\to\infty}\int_0^\infty e^{-\alpha x}p_n(x)\,dx=\int_0^\infty e^{-\alpha x}p(x)\,dx \qquad (26)$$

for every $\alpha > 0$. Then

$$\lim_{n\to\infty}\int_0^x p_n(u)\,du=\int_0^x p(u)\,du.$$

Proof. The function $f_n(x)=p(x)-p_n(x)$ is summable in $(0,\infty)$ and, by (26),

$$\lim_{n\to\infty}\int_0^\infty e^{-\alpha x}f_n(x)\,dx=0\quad\text{for every }\alpha>0. \qquad (27)$$

The function

$$g_n(z)=\int_0^\infty e^{-zx}f_n(x)\,dx$$

is an analytic function of z, regular at every z with a positive real part and uniformly
bounded. By (27) $g_n(z)$ tends to a limit whenever z is real and positive. Hence by
Vitali's theorem of convergence* $g_n(z)$ tends to a limit uniformly in the half-plane
on the right of the imaginary axis. This limit is an analytic function regular at
every z with a positive real part and vanishes whenever z is real and positive;
therefore it vanishes identically. Hence

$$\lim_{n\to\infty}\int_0^\infty e^{-\alpha x+itx}p_n(x)\,dx=\int_0^\infty e^{-\alpha x+itx}p(x)\,dx \qquad (28)$$

for all $\alpha > 0$ and real t. Let

$$L_n(\alpha)=\int_0^\infty e^{-\alpha x}p_n(x)\,dx,\qquad L(\alpha)=\int_0^\infty e^{-\alpha x}p(x)\,dx,$$

so that

$$\lim_{n\to\infty}L_n(\alpha)=L(\alpha). \qquad (29)$$

It follows from (28) and (29) that

$$\lim_{n\to\infty}\frac{1}{L_n(\alpha)}\int_0^\infty e^{-\alpha u+itu}p_n(u)\,du=\frac{1}{L(\alpha)}\int_0^\infty e^{-\alpha u+itu}p(u)\,du,$$

whence, by a well-known property of the characteristic function,

$$\lim_{n\to\infty}\frac{1}{L_n(\alpha)}\int_0^x e^{-\alpha u}p_n(u)\,du=\frac{1}{L(\alpha)}\int_0^x e^{-\alpha u}p(u)\,du,$$

whence

$$\lim_{n\to\infty}\int_0^x e^{-\alpha u}p_n(u)\,du=\int_0^x e^{-\alpha u}p(u)\,du,$$

whence

$$\lim_{\alpha\to+0}\ \lim_{n\to\infty}\int_0^x e^{-\alpha u}p_n(u)\,du=\int_0^x p(u)\,du.$$

The order of the above repeated limit can be interchanged because

$$\int_0^x e^{-\alpha u}p_n(u)\,du\to\int_0^x p_n(u)\,du$$

uniformly in n as $\alpha\to+0$. This completes the proof.

* Titchmarsh (1932), p. 168.
We shall now derive an expression for the Laplace integral $\mathcal{E}(\exp(-\alpha V))$. To
simplify the notation we shall use a single letter, e.g. y, to denote a set of variables,
e.g. all the $y_{ir}$, when they figure as arguments of a function. We shall write dy for
$\prod dy$ and a single integration sign for multiple integrals with respect to the variables
from $-\infty$ to $\infty$. We denote the integral $\mathcal{E}(\exp(-\alpha V))$ by $L(\alpha)$.

The following equation can be verified by direct integration, on referring to 
(10) as the definition of F : 

exp ( - = jfiy, z, a, i) di, 

where 

f(y, z, a, t) = (27r)-!P«i [ exp I ^ ^ S hjhj + *« S S yirk\ > 

ni 

~ •••ip)) 

r=l 

and where the letter i, when not figuring as an index, stands for $\sqrt{-1}$. Denoting
by p(y, z) the coefficient of $\prod dy\,dz$ in (21), we have

$$L(\tfrac12\alpha^2)=\int p(y,z)\,dy\,dz\int f(y,z,\alpha,t)\,dt=\int dt\int p(y,z)f(y,z,\alpha,t)\,dy\,dz, \qquad (30)$$

as the change of the order of the integration is obviously legitimate for every real $\alpha$.
Direct substitution gives


P{y> «)/(?/> 2, a, t) dydz = (27r)-JP(2ni+»i)exp ( - |lF)/i(a, t)f^{a, t), (31) 


r [ 1 P * /— P n, 

where Ua, <) = exp - - S «« + S \'Kyii + »“ 2 2 ^*2/* 

J I i=l i=lr=l 

fziaj) - jl 6,ij.li’Hexp 


dy, 


IP IP 

v-' • * — 

2 i=i 


2 0 2 




The integral (32) is readily evaluated and runs 


/i(a, t) = (27r)i^'‘i exp {\W ) exp ] - E 2 4 “ 2 

i=l ?■=! i=l 


(32) 

(33) 

(34) 


The integral (33) can be evaluated by using Wilks’s formula for the moments of 
the generalized variance.* The result is 


/^(a, t) = 2i^>"i(27r)*»'' if | \ 

where = 1 if t = j and 0 otherwise, and where 

K= + 

rUn-i + l) ' 


( 36 ) 


* Wilks (1932). 
From (34), (35), (31) and (30) we obtain

i(^a2) = f| %+sy l-J'''i+»)exp ■ - K i S VAilfil dl, 

J -(=1 J 


whence 

L{a) = iT-^PHE. 






i =■! 


i«l 


dl. (36) 


It may be observed that the right-hand side of (36) involves no sum of n terms;
hence it is particularly useful when we wish to make n approach infinity.

We may calculate the moments of F by means of (36), Thus 


S’{V) = 7T-iP”^K 

1 

~ n-p~l 


{W+pnfi, 


dt 


(37) 


= 7t~ip-iK +2 L 




+ 


A ( S VA; 


dt 


y/2 


4SA,A^ 

i+i 


(?i.-3J-l)(w-p-3) {n-p){n~p-l){n-p-Z) 


+ 2W 


pni + 2 


•2('F-1)(%-1) 


•f'; 


{n-p~l){n-p-^) {n~p)(n-p~l){n~p~Z) 
pnfpni + 2) 2(p-l){n^-l) 


(ft-p- l)(n-p-3) (n-p){n~p~l)[n-p-2y 


(38) 


Remark. Consider again the case of k samples (see Example on p. 221) and
write $V_k$ and $\Psi_k$ correspondingly. It is seen from (37) that the statistic

$$(M-k-p-1)V_k-p(k-1)$$

is an unbiassed estimate of $\Psi_k$. We may regard $V_k$ as a generalization of $V_2$, which is
the "Studentized" $D^2$ statistic* but for a constant factor. $V_2$ is, of course, also
identical with Hotelling's generalized "Student's" ratio except for a constant
factor.
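The unbiassedness claim in the Remark amounts to a one-line rearrangement of the mean of $V_k$. The sketch below is an illustration only; the function names and the numerical values are assumptions, and the expression used for the mean of $V_k$ is the one implied by the Remark itself, namely $(\Psi_k + p(k-1))/(M-k-p-1)$.

```python
def expected_v_k(psi_k, total_size, k, p):
    """Mean of V_k implied by the Remark: (Psi_k + p(k-1)) / (M - k - p - 1)."""
    denom = total_size - k - p - 1
    if denom <= 0:
        raise ValueError("M - k - p - 1 must be positive")
    return (psi_k + p * (k - 1)) / denom


def psi_k_estimate(v_k, total_size, k, p):
    """The statistic (M - k - p - 1) V_k - p(k - 1) from the Remark."""
    return (total_size - k - p - 1) * v_k - p * (k - 1)


# Hypothetical design: k = 3 samples, p = 2 variables, M = 30 observations.
psi_true = 2.0
v_mean = expected_v_k(psi_true, total_size=30, k=3, p=2)
recovered = psi_k_estimate(v_mean, total_size=30, k=3, p=2)
print(v_mean, recovered)
```

Feeding the mean of $V_k$ into the statistic returns $\Psi_k$, which is what unbiassedness requires since the statistic is linear in $V_k$.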

We shall next study the behaviour of V as $n\to\infty$. As n is now allowed to vary,
we shall attach to various letters the index n. In particular we write $\Psi_n$ for $\Psi$
to emphasize the fact that the value of $\Psi$ depends also on n, a fact which is not
brought out by the formal definition of $\Psi$. This is because the $\eta_{ir}$, as we explained
on p. 221, are themselves linear functions of the original population constants
with coefficients depending upon n.

* Bose and Roy (1938).
It is in general true that, except when $H_0$ is true, in which case $\Psi_n\equiv0$, $\Psi_n$ is
either O(1) or O(n) as $n\to\infty$. That both are possible is seen in the Example on
p. 221 for k = 2; $\Psi_n$ is O(n) or O(1) according as $m_1=m_2\to\infty$ or one of $m_1$, $m_2$
remains fixed while the other tends to infinity.

Theorem 4. If $\Psi_n = O(n)$, then the mean of V is $\frac1n\Psi_n+O\big(\frac1n\big)$ and the variance
of V is $O\big(\frac1n\big)$. Hence the random variable $V-\frac1n\Psi_n\to0$ in probability.

This is an immediate consequence of (37) and (38).

If $\Psi_n = O(1)$, and if $\Psi_n$ tends to a limit as $n\to\infty$, the limiting distribution of
nV will be that of S (cf. (2)) with $\Psi$ replaced by its limit. This is our next theorem.
We assume now that

$$\lim_{n\to\infty}\Psi_n=\Psi_0. \qquad (39)$$

The case $\Psi_n\equiv0$ (i.e. $H_0$ is true) is covered by (39) on putting $\Psi_0 = 0$. But (39)
may also be regarded as covering the case $\Psi_n = O(n)$ in the sense that the repeated
sampling is not made from the same population, but from populations whose
distribution constants vary with n in such a way that (39) holds true. Thus, for
example, in the case of two samples with $m_1 = m_2$ (see p. 222) we have

$$\Psi_n=\tfrac14(n+2)\Phi,$$

where

$$\Phi=\sum_{i,j=1}^{p}\alpha_{ij}(\xi_{i1}-\xi_{i2})(\xi_{j1}-\xi_{j2}).$$

Here we have $\Psi_n = O(n)$ if the population constant $\Phi > 0$. $\Psi_n$ will be O(1) if we
consider $\Phi$ as decreasing to the order $n^{-1}$ as n increases indefinitely. This idea of
regarding population constants as varying with the sample size can be found in
the works of Fisher* and Neyman.†


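Theorem 5. If $\Psi_n$ tends to a limit $\Psi_0$ as $n\to\infty$, then

$$\lim_{n\to\infty}\Pr\{nV\le x\}=\int_0^x p(u)\,du, \qquad (40)$$

where

$$p(x)=2^{-\frac12pn_1}x^{\frac12pn_1-1}e^{-\frac12(x+\Psi_0)}\sum_{h=0}^{\infty}\frac{(x\Psi_0)^h}{4^h\,h!\,\Gamma(h+\tfrac12pn_1)}. \qquad (41)$$

In particular, if $\Psi_0 = 0$ (i.e. $H_0$ is true), the limiting distribution of nV is that of $\chi^2$
with $pn_1$ degrees of freedom.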

Proof. Write $L_n^*(\alpha)$ for the Laplace transform of the probability density
function of nV, so that

$$L_n^*(\alpha)=\mathcal{E}(\exp(-\alpha nV))=L(n\alpha).$$

* Fisher (1928), p. 663.
† Neyman (1937), pp. 169-70, (1938), pp. 70-1.
If we replace $\alpha$ by $n\alpha$ in (36) and subsequently make the change of variables
$t_{ir}=n^{-1/2}\tau_{ir}$, we get

Lf (a) = 




l-l(«i+n) 


where /C is now written for K and 


exp j - a S - i V(2a) S fXijJ dr, 
{“l i=l I 

(42) 


Hi 


(bi~ I) 2) •••iP). 


r=r 


We shall now find the limit of $L_n^*(\alpha)$ as $n\to\infty$. We have, using Stirling's formula,


lim 


For every system of fixed values of the we have 


(43} 


whence 


lim 

71-4- » 


n 


= i+i 1:4+0 

n-i^l 


W 


2 > 


% 




= exp(-^ S4). 


(44) 


Calling 4 the integral in (42), we have 


1 

4= ' 




w 


= ^n+-®w 




- exp ( - IS «ii) j exp I - a S - i V(2£x) S VAi 
+ [exp ( - (I + a) S Sff - i V(2a) S dr 

J V i=l 1 


(45) 


The absolute value of the integrand of is less than 

2exp|-aS4)- 

Hence by (44) and dominated convergence, as ?i->-co. The value of is 
easily found to be 

= (27r)‘J’’'i ( 1 + 2a)->P"i exp | j ■ 

From (42), (43) and (45) we obtain the result

$$\lim_{n\to\infty}L_n^*(\alpha)=(1+2\alpha)^{-\frac12pn_1}\exp\Big\{-\frac{\alpha\Psi_0}{1+2\alpha}\Big\}. \qquad (46)$$

If p(x) is defined as in (41), then the integral

$$\int_0^\infty e^{-\alpha x}p(x)\,dx$$

is identically equal to the right-hand side of (46). Theorem 5 is thus proved on
remembering Lemma 2.
7. Behaviour of W when $H_0$ is not necessarily true and n is large. We shall
establish the following theorem.

Theorem 6. If $\Psi_n = O(1)$, the statistics nV and $-n\log W$ tend to be certainly
identical in the sense that

$$\lim_{n\to\infty}(nV+n\log W)=0\quad\text{in probability}. \qquad (47)$$

Corollary. If $\Psi_n$ tends to a limit as $n\to\infty$, $-n\log W$ has the same limiting
distribution (40) as nV as $n\to\infty$.

Remark. Besides the Corollary there is another significance to be attached to
Theorem 6. Since the test functions V and W are not functionally related (except
when p = 1 or $n_1 = 1$), there is the question of choosing one of them to be
consistently used in carrying out the actual tests. This can only be decided by a
comparison of their power* to detect the falsehood of $H_0$ when the $\eta_{ir}$ do not all
vanish. While this may be a difficult problem for small samples, Theorem 6 appears
to have answered the question for large samples. In fact, if n is large, it is almost
certain that the values of nV and $-n\log W$ calculated from the sample will differ
very little. That $-n\log W$ and nV tend to have the same power function is
a consequence of the Corollary.
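Proof of Theorem 6. For every $\theta$ such that $0\le\theta<1$ we have

$$0\le\frac{\theta}{1-\theta}+\log(1-\theta)\le\frac12\Big(\frac{\theta}{1-\theta}\Big)^2.$$

Hence, remembering (12) and (13),

$$0\le V+\log W\le\frac12\sum_{i=1}^{l_1}\Big(\frac{\theta_i}{1-\theta_i}\Big)^2\le\frac12V^2,$$

whence, for every $\eta > 0$,

$$\Pr\{|nV+n\log W|>\eta\}=\Pr\{nV+n\log W>\eta\}\le\Pr\{\tfrac12nV^2>\eta\}\le\frac{n}{2\eta}\,\mathcal{E}(V^2)\to0,$$

because of (38). This establishes (47).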

Proof of the Corollary. Let

$$F_n(x)=\Pr\{nV\le x\},\qquad G_n(x)=\Pr\{-n\log W\le x\}.$$

Then for every x and $\eta > 0$ we have†

$$0\le G_n(x)-F_n(x)\le F_n(x+\eta)-F_n(x-\eta)+\Pr\{nV+n\log W\ge\eta\}.$$

* Neyman & Pearson (1936, 1938).
† Fréchet (1937), p. 164.

Let F(x) be the limit of $F_n(x)$ as $n\to\infty$, which is known to exist by virtue of
Theorem 5. We have then

$$0\le G_n(x)-F_n(x)\le|F_n(x+\eta)-F(x+\eta)|+|F_n(x-\eta)-F(x-\eta)|+F(x+\eta)-F(x-\eta)+\Pr\{nV+n\log W\ge\eta\}. \qquad (48)$$

Given any $\epsilon > 0$, choose and fix an $\eta > 0$ so small that $F(x+\eta)-F(x-\eta)<\epsilon$. By
(47) the rest of the terms in the right-hand side of (48) is smaller than $\epsilon$ for all
sufficiently large n. Hence $G_n(x)-F_n(x)\to0$, i.e. $G_n(x)\to F(x)$, which completes
the proof.

SUMMARY

The Wilks-Lawley hypothesis concerning population means of multivariate
normal populations is put in the canonical form $H_0$ (§ 1), and, assuming the
population variances and covariances known, the test function S is derived together with
its exact distribution (§ 2). In the case where the population variances and
covariances are unknown, two possible test functions, denoted by V and W, are
considered (§§ 3 and 4), and their distributions in certain special cases are given.
In § 5 it is shown that the sample space can be so transformed that all the variables
are independently distributed and that only a minimum number of unknown
parameters remain. These parameters are the roots of a certain determinantal
equation; the hypothesis $H_0$, and the hypotheses of colinearity and coplanarity
of populations, all specify the values zero for all or some of these parameters.
Returning to the test functions V and W, it is shown that as the sample size n
increases indefinitely the two functions nV and $-n\log W$ tend to be certainly
identical ((47), § 7), and both of them tend to have the same distribution function
as S ((40), § 6).
THE DERIVATION OF THE FIFTH AND SIXTH
MOMENTS OF THE DISTRIBUTION OF $b_2$ IN
SAMPLES FROM A NORMAL POPULATION


By C. T. HSU AND D. N. LAWLEY
Department of Statistics, University College, London

1. Introduction

The method of combinatorial analysis was first introduced in 1928 in a paper by
R. A. Fisher (1929). In this paper Fisher defined new symmetric functions
$k_1, k_2, k_3, \dots$ of the observations for samples of a given size, and gave simple rules
for determining the cumulants or semi-invariants of their joint sampling
distribution. This method is especially valuable for the case where the sampled
population is normal, and in this case its use reduces considerably the labour
of deriving the higher sampling moments of the distribution of product
moment statistics. Two further papers appeared later, one a joint paper of
R. A. Fisher and J. Wishart (1931) in which further rules were given, the other
by Wishart (1930) in which he described applications of the theory and gave a list
of higher order formulae for the normal case. By means of these formulae
E. S. Pearson (1930) was able to derive the first four moments of the sampling
distribution of the statistics $\sqrt{b_1}$ and $b_2$ for the case when the sampled
population was assumed to be normal.

The object of the present paper is to derive formulae for $\kappa(4^5)$ and $\kappa(4^6)$
(quantities which are defined below) and then to use these to determine the fifth
and sixth moment coefficients of the distribution of $b_2$. This work is preliminary
to further investigations regarding the sampling distribution of this expression.

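If $x_1, x_2, \dots, x_n$ be a sample of size n from a given population, then adopting
Fisher's definition of the symmetric functions $k_1, k_2, k_3, k_4$, we have

$$k_1=\bar x,\qquad k_2=\frac{n}{n-1}m_2,\qquad k_3=\frac{n^2}{(n-1)(n-2)}m_3,$$

$$k_4=\frac{n^2}{(n-1)(n-2)(n-3)}\{(n+1)m_4-3(n-1)m_2^2\},$$

where $m_r=\frac1n S(x-\bar x)^r$, $\bar x=\frac1n S\,x$.

The coefficients $k_r$ are chosen so that $\mathcal{E}(k_r)=\kappa_r$, where $\kappa_r$ is the rth cumulant of
the given population, and the pth cumulant of the distribution of $k_r$ is as usual
denoted by $\kappa(r^p)$.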
$\kappa(r^p)$ is expressible as the sum of a number of terms of order pr consisting of
powers and products of the coefficients $\kappa_1, \kappa_2, \dots$, and to each such term there
corresponds a number of two-way partitions whose coefficients have to be
evaluated. As in this paper we are, however, assuming the sampled population
to be normal, the only non-vanishing cumulant is $\kappa_2$, which is equal to the
variance of the distribution. The only term, therefore, which has to be evaluated
is the one containing a power of $\kappa_2$.

2. The derivation of $\kappa(4^5)$

For a full explanation of what follows the reader is referred to the three
papers by Fisher and Wishart already cited.

To determine the formula for $\kappa(4^5)$ we must find the coefficients of all the two-way
patterns for the term in $\kappa_2^{10}$. There are altogether five such patterns, which
are given in Wishart's paper, and they are reproduced below. In each pattern it
will be noted that there are five corners with four arms attached, each corner
representing a $k_4$, and ten connections between the pairs of arms, each
connection representing a $\kappa_2$.

[Fig. 1. The two-way patterns for $\kappa(4^5)$.]

For each of these five patterns, we must determine (1) the numerical coefficient,
(2) the n-coefficient.

It is not necessary to give in detail all the working required for doing this, but 
we shall give two examples for each process and also a summary of the results. 

(1) The numerical coefficient is obtained by enumerating all the ways in 
which the pattern can be connected up, regarding as a separate entity each corner 
and also each arm attached to that corner. 

(a) Consider first the pattern A.

[Fig. 2. Pattern A, with adjacent corners P and Q.]

There are ½ × 4! = 12 ways of determining the cyclic order of the five corners.
There are six ways of choosing a pair of arms belonging to corner P to connect
up to Q, and similarly for the other four corners.

Finally, there are two ways of joining up the double arms between each pair of
adjacent corners. The numerical coefficient of A is therefore $12\times6^5\times2^5=12\times12^5$.




Fig. 3. 


There are five ways of choosing which corner is to form the centre T. 

There are three ways of choosing the pairs of corners P, Q and B, S which are 
linked up by double arms, and two ways of connecting up these two pairs. 

The arms of corner T may be arranged in 4 ! = 24 ways, while those of each 
of the other corners may be arranged in 12 ways. 

Finally, there are two ways of connecting up each of the two double arms. 

Hence the numerical coefficient* of D is 

6 X (3 X 2) X (24 X 12*) x 2* = 240 x 126. 

We may remark that the five normal patterns for $\kappa(4^5)$ can all be obtained from
those of $\kappa(4^4)$ by the insertion of a fifth corner. This fact allows us to derive the
numerical coefficients by a different method, which may be used as a check. This
method must, however, be employed with care as otherwise it is liable to suggest
a wrong result. We shall illustrate it by considering again the pattern A.

The pattern A may be formed from the smaller pattern, which has a numerical
coefficient of 62208, by breaking one of the double arms and inserting a new
four-way corner.

[Fig. 4.]

The break may be made in four places and the arms of the new corner may
then be joined up in 24 ways. The numerical coefficient of A would thus at first
sight appear to be $62208\times4\times24=24\times12^5$.

It must, however, be remembered that when for instance the break in the
smaller pattern is made between P and Q, the broken pattern corresponds to two
arrangements of the unbroken pattern, since there are two ways of joining up
the double arms between P and Q. Hence the true value of the numerical
coefficient of A is in fact half the above number, i.e. $12\times12^5$.
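The counting arguments for patterns A and D, and the cross-check by insertion of a fifth corner, are plain integer arithmetic and can be confirmed directly. The variable names below are illustrative and not part of the original paper.

```python
from math import factorial

# Pattern A: cyclic orders of five corners, a pair of arms chosen at each
# corner (6 ways each), and 2 ways of joining each of the five double arms.
cyclic_orders = factorial(4) // 2                  # (1/2) * 4! = 12
coeff_A = cyclic_orders * 6 ** 5 * 2 ** 5

# Pattern D: choice of centre, pairing of the outer corners, arm arrangements
# (24 for the centre, 12 for each of the four others), two double arms.
coeff_D = 5 * (3 * 2) * (factorial(4) * 12 ** 4) * 2 ** 2

# Check by insertion of a fifth corner into the smaller pattern.
naive_A = 62208 * 4 * 24        # value suggested at first sight
checked_A = naive_A // 2        # halved, as explained in the text

print(coeff_A, coeff_D, naive_A, checked_A)
```

Both routes give $12\times12^5$ for pattern A, and the direct count for D gives $240\times12^5$, in agreement with the values stated above.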

* There is apparently a mistake in Wishart's paper, the numerical coefficient of D being given
there as $120\times12^5$. This will of course mean that the approximate value of $\kappa(4^5)$ found in that paper
is incorrect.
(2) The n-coefficient.

In a previous paper (Fisher & Wishart, 1931) rules were given for
deriving the n-coefficients of a given pattern from those of non-vanishing
patterns of a lower order which are already known. Thus we can derive the
n-coefficient of A, which is one of the patterns of $\kappa(4^5)$, from those of two
of the patterns of $\kappa(4^4)$. This is illustrated in Fig. 5; the Greek letters below
each pattern are used to denote the n-coefficient of that pattern.



a = 2 


(- 1 ) 




n{n + l) 


{n~i){n~Zy ' {n~\){n~2){n~Z) 


r. 


and we also know that 




l)w* — 21?i2 — 14w-f4 


and 


, _ 


Hence we obtain the result 


^(^+1) f I 

^ ~ (n--l)*(?i,-2)^(n-3)*^^ 

This is the w-coefficient of A. 
Similarly for pattern B. 


1 Iw® -I- 46)1* - 86)t® -f 70n® — 36)^ -f- 8}, 
Here we have


A = 


?t+ 1 


/t = 


(n-l)(m-2)(n-3) 

n 




and 


Hence 


A = 


n-l 

n{n+l) {n^ — 5n + 2) 
(■n—l)^{n-2f (w- 1)®' 
n'^{n+ l)®(?i^-5w + 2) 


(w-l)Mw-2)3(w-3)3‘ 

This is the n-coefficient of B, 

The complete results are as follows: 


n.-ooeificient 

-^^l(7^+l)/{{a-lp (m-ap (»-3)'} 

— ll7i‘+45n''- 86?i“ + 70n.“— 36n-|- 8 
tt“- 9«.® + 23»'*- 7»i“-287i“+12n 
u«-12«‘ + 62)ii- 94 w'' + 5971*-30??.+ 8 
n“ — 1311® + saw"- 103m^ + 48?i*-24?i+ 8 
n* — 16 m*+ 80»‘‘ — l60»® + 76r!.®-41n+ 12 

The formula for $\kappa(4^5)$ is calculated by finding the sum of products of the
numerical coefficients with the n-coefficients. We thus obtain the result


Numerical 

Patteni coeffioient/ia® 
A 12 

B 80 

C 120 

D 240 

E 32 


K{i^) = 




- 10,67 + 4:900n« - 2536n + 840} 40 . 


3. The derivation of $\kappa(4^6)$

There are altogether 17 normal patterns for $\kappa(4^6)$, which are shown in Fig. 7.

All the patterns except the last two (R and S) may be obtained from those of
$\kappa(4^5)$ by the insertion of a sixth corner, while the patterns R and S may be obtained
from those of $\kappa(4^4)$ by the insertion of a pair of corners. The numerical coefficients
may as before be calculated in two ways and we shall content ourselves with a
summary of the results. We shall, however, give two examples of the derivation
of the n-coefficients. That for the pattern A is shown in Fig. 8. Here we have
(-1) 7l(»+l) 


a = 2x 




(71-2) (n-3) 
n{n+]) 


(w-1)^ (w-2)^ (w-3)' 


' (n-l)(n-2)(n-3)^’ 

(n« - 1 ln® + 45«,‘ - + W ■ 


•36n + 8}, 


y = 






(?r-l)(n-2) (7 i- 1)(«,-2) (?i - 1 (n - 2)^ 


Hence 


(w- 1)* (71-2)*‘ 

$$\frac{n(n+1)}{(n-1)^5(n-2)^5(n-3)^5}\,\{n^8 - 14n^7 + 78n^6 - 220n^5 + 341n^4 - 310n^3 + 212n^2 - 88n + 16\}.$$

This is the $n$-coefficient of A.

Similarly for the pattern S.


-if — 

X 


A = 

/i = 

and = 

Hence the $n$-coefficient of S is

A = 


A Ik, 

y- 

Fig. 9. 

(n+i) 

(?i-l)(n-2) {n-if' 


n 

n-l 


V, 


n^n+ 1)^ 


r~j 

'L.J 


V 


(?i-l)3(ft-2)2(n-3)2‘ 

$$\frac{n^3(n+1)^3}{(n-1)^5(n-2)^3(n-3)^3}.$$

The complete results are given in the following table. Here $c$ denotes the numerical
coefficient divided by $k$, where $k = 2^{14} \times 3^4 \times 5 = 6{,}635{,}520$, and $a_r$ is the coefficient
of $n^r$ in the expression for the $n$-coefficient divided by $N$, where

$$N = \frac{n(n+1)}{(n-1)^5(n-2)^5(n-3)^5}.$$


Table of coefficients

Pattern      c   a8   a7    a6    a5    a4    a3    a2    a1   a0
A           27    1  -14    78  -220   341  -310   212   -88   16
B          216    1  -12    54  -100    33    96   -80    24    0
C          324    1  -15    88  -254   373  -279   178   -76   16
D          324    1  -14    70  -140    53   138  -116    24    0
E          162    1  -15    90  -266   389  -267   160   -76   16
F          648    1  -16    99  -292   403  -236   153   -64   16
G          216    1  -16   104  -324   449  -208   110   -52   16
H          432    1  -13    60  -106    17   111   -62    24    0
J         1296    1  -16   100  -300   421  -236   126   -64   16
K          216    1  -16   103  -320   451  -220   101   -52   16
L          324    1  -16    99  -292   403  -236   153   -64   16
M         2592    1  -17   111  -338   451  -193   101   -52   16
N          432    1  -13    60  -106    17   111   -62    24    0
P         1296    1  -18   129  -442   675  -342   195   -94   24
Q          432    1  -19   144  -521   825  -385   210  -107   28
R           72    1   -8    18     4   -47    12    36     0    0
S           16    1   -8    18     4   -47    12    36     0    0

The formula for $\kappa(4^6)$ derived from these results is

$$\kappa(4^6) = \frac{6{,}635{,}520\,n(n+1)}{(n-1)^5(n-2)^5(n-3)^5}\,\{9025n^8 - 145{,}532n^7 + 920{,}610n^6 - 2{,}775{,}248n^5 + 3{,}759{,}853n^4 - 1{,}717{,}116n^3 + 946{,}872n^2 - 476{,}064n + 136{,}080\}\,\kappa_2^{12}.$$


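4. The moments of the distribution of $b_2$

The formulae of $\kappa(4^2)$, $\kappa(4^3)$ and $\kappa(4^4)$ have been found previously and we will
give the results:

$$\kappa(4^2) = \frac{24n(n+1)}{(n-1)(n-2)(n-3)}\,\kappa_2^4,$$

$$\kappa(4^3) = \frac{1728n(n+1)(n^2-5n+2)}{(n-1)^2(n-2)^2(n-3)^2}\,\kappa_2^6,$$

$$\kappa(4^4) = \frac{6912n(n+1)(53n^4-428n^3+1025n^2-474n+180)}{(n-1)^3(n-2)^3(n-3)^3}\,\kappa_2^8.$$

We may now calculate the moments of the distribution of $k_4$ as follows:

$$\mu(4^2) = \kappa(4^2),$$

$$\mu(4^3) = \kappa(4^3),$$

$$\mu(4^4) = \kappa(4^4) + 3\kappa^2(4^2) = \frac{1728n(n+1)}{(n-1)^3(n-2)^3(n-3)^3}\,\{n^5 + 207n^4 - 1707n^3 + 4105n^2 - 1902n + 720\}\,\kappa_2^8,$$

$$\mu(4^5) = \kappa(4^5) + 10\kappa(4^2)\kappa(4^3) = \frac{82{,}944\,n(n+1)}{(n-1)^4(n-2)^4(n-3)^4}\,\{5n^7 + 1402n^6 - 17{,}516n^5 + 75{,}870n^4 - 128{,}205n^3 + 59{,}000n^2 - 30{,}492n + 10{,}080\}\,\kappa_2^{10},$$

$$\mu(4^6) = \kappa(4^6) + 15\kappa(4^4)\kappa(4^2) + 10\kappa^2(4^3) + 15\kappa^3(4^2)$$

$$= \frac{207{,}360\,n(n+1)}{(n-1)^5(n-2)^5(n-3)^5}\,\{n^{10} + 770n^9 + 278{,}359n^8 - 4{,}603{,}808n^7 + 29{,}339{,}555n^6 - 88{,}717{,}430n^5 + 120{,}380{,}577n^4 - 55{,}075{,}788n^3 + 30{,}365{,}028n^2 - 15{,}250{,}464n + 4{,}354{,}560\}\,\kappa_2^{12}.$$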
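
The relations used above between the moments about the mean and the cumulants can be sketched in a few lines. This is only an illustration: the function name and the Poisson example are assumptions introduced here and do not come from the paper.

```python
def central_moments_from_cumulants(k2, k3, k4, k5, k6):
    """Return (mu2, ..., mu6), the moments about the mean, from cumulants k2..k6.

    These are the standard relations, the same ones applied in the text to
    the cumulants kappa(4^2), ..., kappa(4^6) of k4.
    """
    mu2 = k2
    mu3 = k3
    mu4 = k4 + 3 * k2**2
    mu5 = k5 + 10 * k2 * k3
    mu6 = k6 + 15 * k4 * k2 + 10 * k3**2 + 15 * k2**3
    return mu2, mu3, mu4, mu5, mu6


# Illustrative check on a Poisson variate, whose cumulants all equal the mean.
lam = 2
print(central_moments_from_cumulants(lam, lam, lam, lam, lam))
```

For a normal variate all cumulants above the second vanish, so the same function returns the familiar values $3\sigma^4$ and $15\sigma^6$ for the fourth and sixth moments.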

The quantity $b_2$ is defined to be $m_4/m_2^2$ and, using the definitions of $k_4$ and $k_2$ given
in § 1, we see that

$$b_2 = \frac{(n-1)(n-2)(n-3)}{n^2(n+1)}\,\frac{k_4}{m_2^2} + \frac{3(n-1)}{n+1} = \frac{(n-2)(n-3)}{(n-1)(n+1)}\,\frac{k_4}{k_2^2} + \frac{3(n-1)}{n+1}.$$

Thus the average value of $b_2$ is $3(n-1)/(n+1)$ and, for $r > 1$, the $r$th moment of
the distribution of $b_2$ is given by

$$\mu_r(b_2) = \left\{\frac{(n-2)(n-3)}{(n-1)(n+1)}\right\}^r \mu_r\!\left(\frac{k_4}{k_2^2}\right) = \left\{\frac{(n-2)(n-3)}{(n-1)(n+1)}\right\}^r \mu(4^r 2^{-2r}).$$


Now Fisher (1930) has shown that 


5“4*'3«) = 5^4*3“2-’-) 4 , 


(n- 1 )-- 


and in particular 




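Thus

$$\mu_r(b_2) = \frac{(n-1)^r(n-2)^r(n-3)^r}{(n+1)^r\,(n-1)(n+1)\cdots(n+4r-3)}\,\frac{\mu(4^r)}{\kappa_2^{2r}}.$$

Hence, finally, we obtain the following results:

$$\mu_2(b_2) = \frac{24n(n-2)(n-3)}{(n+1)^2(n+3)(n+5)},$$

$$\mu_3(b_2) = \frac{1728n(n-2)(n-3)(n^2-5n+2)}{(n+1)^3(n+3)(n+5)(n+7)(n+9)},$$

$$\mu_4(b_2) = \frac{1728n(n-2)(n-3)(n^5 + 207n^4 - 1707n^3 + 4105n^2 - 1902n + 720)}{(n+1)^4(n+3)(n+5)(n+7)(n+9)(n+11)(n+13)},$$

$$\mu_5(b_2) = \frac{82{,}944\,n(n-2)(n-3)}{(n+1)^5(n+3)(n+5)\cdots(n+17)}\,\{5n^7 + 1402n^6 - 17{,}516n^5 + 75{,}870n^4 - 128{,}205n^3 + 59{,}000n^2 - 30{,}492n + 10{,}080\},$$

$$\mu_6(b_2) = \frac{207{,}360\,n(n-2)(n-3)}{(n+1)^6(n+3)(n+5)\cdots(n+21)}\,\{n^{10} + 770n^9 + 278{,}359n^8 - 4{,}603{,}808n^7 + 29{,}339{,}555n^6 - 88{,}717{,}430n^5 + 120{,}380{,}577n^4 - 55{,}075{,}788n^3 + 30{,}365{,}028n^2 - 15{,}250{,}464n + 4{,}354{,}560\}.$$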
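
The two simplest of these closed forms can be evaluated in exact rational arithmetic, which is convenient for the kind of numerical check made in the next section. The sketch below is illustrative only; the function names are assumptions introduced here.

```python
from fractions import Fraction


def mu2_b2(n):
    """Second moment about the mean of b2 in normal samples of size n."""
    return Fraction(24 * n * (n - 2) * (n - 3),
                    (n + 1) ** 2 * (n + 3) * (n + 5))


def mu3_b2(n):
    """Third moment about the mean of b2 in normal samples of size n."""
    num = 1728 * n * (n - 2) * (n - 3) * (n * n - 5 * n + 2)
    den = (n + 1) ** 3 * (n + 3) * (n + 5) * (n + 7) * (n + 9)
    return Fraction(num, den)


# Samples of size 4: the fractions reduce to 2^6/(3.5^2.7) and -3.2^10/(5^3.7.11.13).
print(mu2_b2(4), mu3_b2(4))
```

Using `Fraction` rather than floating point keeps the prime factorizations of numerator and denominator visible, so the values can be compared factor by factor with results quoted in factored form.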


5. Check with McKay's results in the case $n = 4$

Since the formulae which we have obtained are somewhat complicated, it is
desirable to provide some sort of check on them. This is supplied by using the
results obtained by McKay (1933), who found in an exact form the distribution
of $b_2$ for $n = 4$, i.e. for samples of size 4.

Putting $x = b_2$, the distribution function $f(x)$ of $x$ is given by

when 1< a: ^ 2 , 


2(9a;-17) 


1 ; ^ 


when 2 < a: < I, 


where 


9a: -17 
(7-3a:)*‘ 


The moments of w = - i about the origin are given by 


Hence 


Vi,(w) = 


1 



|p(^ + 5 -) r(k + 1 ) + 


lkr(k-l)rik + 2) 

2 4(1 !)2 


1 3A;(ib-l)r(/t-f)r(/J: + 3) 

■^ 2’2 42 ( 21)2 ^ 



V 2 M = 


1 

3.7 


2 

21’ 
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f'M = 


61 

5.7.11.13’ 


Vi(w) = 


277 

5.7.11.13.17’ 



79 

3.7.13.17.19’ 


46,889 

~ 5. 7. 11. 13. 17. 19. 23' 

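The moments about the mean of $w$ are given by

$$\mu_2(w) = \frac{4}{3\cdot5^2\cdot7} = \frac{2^2}{3\cdot5^2\cdot7},$$

$$\mu_3(w) = \frac{-48}{5^3\cdot7\cdot11\cdot13} = \frac{-3\cdot2^4}{5^3\cdot7\cdot11\cdot13},$$

$$\mu_4(w) = \frac{1424}{5^4\cdot7\cdot11\cdot13\cdot17} = \frac{2^4\cdot89}{5^4\cdot7\cdot11\cdot13\cdot17},$$

$$\mu_5(w) = \frac{-42{,}624}{3\cdot5^5\cdot7\cdot11\cdot13\cdot17\cdot19} = \frac{-2^7\cdot3\cdot37}{5^5\cdot7\cdot11\cdot13\cdot17\cdot19},$$

$$\mu_6(w) = \frac{76{,}096}{5^5\cdot7\cdot11\cdot13\cdot17\cdot19\cdot23} = \frac{2^6\cdot1189}{5^5\cdot7\cdot11\cdot13\cdot17\cdot19\cdot23}.$$

Thus for the moments of $b_2$, we have

$$\mu_2(b_2) = 4^2\mu_2(w) = \frac{2^6}{3\cdot5^2\cdot7},$$

$$\mu_3(b_2) = 4^3\mu_3(w) = \frac{-3\cdot2^{10}}{5^3\cdot7\cdot11\cdot13},$$

$$\mu_4(b_2) = 4^4\mu_4(w) = \frac{2^{12}\cdot89}{5^4\cdot7\cdot11\cdot13\cdot17},$$

$$\mu_5(b_2) = 4^5\mu_5(w) = \frac{-2^{17}\cdot111}{5^5\cdot7\cdot11\cdot13\cdot17\cdot19},$$

$$\mu_6(b_2) = 4^6\mu_6(w) = \frac{2^{18}\cdot1189}{5^5\cdot7\cdot11\cdot13\cdot17\cdot19\cdot23}.$$
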

It may be easily verified that the same results are obtained by putting $n = 4$
in the formulae given at the end of the preceding section. There is thus reason to
believe that these formulae are in fact correct.

TESTING THE HOMOGENEITY OF A SET OF VARIANCES

By H. O. HARTLEY


1. Introduction 

When analysing data the experimenter is frequently faced with the necessity 
of testing the homogeneity in a set of estimated variances. When it is desired to 
combine a number of variances to obtain an estimate of the common variance 
it is necessary to apply such a test. Again, if a selected “treatment mean square ” 
is to be compared with an “error mean square’’, a test for homogeneity has 
recently been proposed (Wishart, 1938) as a safeguard against the selection of 
the largest mean square from a set of random ones. 

For general use in such cases Neyman & Pearson (1931) have suggested a test:
the $L_1$ test. The statistic $L_1$ used in this test has been modified by Bartlett (1937)
and generalized by Welch (1935, 1936). From recent work (Nair, 1938; Bishop
& Nair, 1939; Pitman, 1939) it would appear that Bartlett's statistic $\mu$ is the
best to use, because it is unbiased in the sense defined by Neyman & Pearson
(1936, 1938). Or more precisely, the $L_1$ test in its original form is biased with
regard to the admissible set of alternatives.

Some difficulty has been experienced in obtaining the random sampling 
distribution of this statistic which is required for a test. Various approximations 
have been worked out. There are 

(a) Bartlett's (1937) approximation using the $\chi^2$ distribution.

(b) P. P. N. Nayer's (1936) approximation obtained by fitting Pearson-type
curves to the distribution in the special case where all mean squares are based
on the same number of degrees of freedom.

(c) U. S. Nair's (1938) expansion of the exact distribution in the special case
mentioned in (b).

(d) Recently another paper on the subject has appeared by E. J. G. Pitman
(1939). In this paper the author transforms the distribution of $L_1$ into a multiple
integral which can be evaluated in special cases (small values of $k$) by reduction
to elliptic integrals.

The accuracy of the approximations (a) and (b) has recently been tested
(Bishop & Nair, 1939) in the special case in which the expansion (c) is available.
Bartlett's findings were confirmed; it was shown that his approximation is valid
only for moderate or large numbers of degrees of freedom ($\ge 3$). We shall also
show in this paper that even with this restriction for the degrees of freedom the
approximation is not very accurate if $k$, the number of mean squares in the set,
is large.

While U. S. Nair's expansion, although it is very complicated, provides
a means of working out the exact probability integral in the special case where
all mean squares are based on the same number of degrees of freedom, there is
still uncertainty in the general case. P. P. N. Nayer has suggested that the test
for homogeneity between $k$ mean squares with $f_t$ degrees of freedom ($t = 1, 2, \ldots, k$)
is (under certain conditions) identical with testing the homogeneity between $k$
mean squares all of which have $f$ degrees of freedom, where $f$ is the arithmetic mean
of the $f_t$. We shall show that, although there is some truth in this statement,
the harmonic mean should be used for $f$ rather than the arithmetic mean.

Since Bartlett's approximation does not provide a test of sufficient accuracy
in all cases, the main difficulty of dealing with the general case has been the
large number of quantities on which the exact distribution depends: if the $k$ mean
squares in the set have $f_t$ degrees of freedom respectively ($t = 1, 2, \ldots, k$), the
distribution would depend on $k + 1$ quantities. We shall now show in this paper
that (provided $f_t \ge 2$) there is an approximation of sufficient accuracy which
depends on three quantities only. These three are:

(i) $k$, the number of mean squares in the set;

(ii) $c_1 = \sum_{t=1}^{k} \dfrac{1}{f_t} - \dfrac{1}{F}$, where $F = \sum_t f_t$;

(iii) $c_3 = \sum_{t=1}^{k} \dfrac{1}{f_t^3} - \dfrac{1}{F^3}$.

This makes the distribution amenable to tabulation, so that the test can be reduced
to an inspection of a table of 5 % and 1 % points which can easily be carried
out by the experimenter.

In the case where mean squares having one degree of freedom occur in the 
set, the distribution is of a more complicated character, but our approximation 
is still fair. 

2. The formal solution 

Consider $k$ normal populations with variances $\sigma_t^2$ ($t = 1, 2, \ldots, k$). Let $s_t^2$ be an
unbiased estimate of $\sigma_t^2$ based on $f_t$ degrees of freedom, and let us denote by $F$
the total number of degrees of freedom,

$$F = \sum_{t=1}^{k} f_t. \qquad (1)$$

Bartlett's statistic $\mu$ is then given by

$$-2\log\mu = F\log\Big\{\sum_t (f_t s_t^2)/F\Big\} - \sum_t f_t \log s_t^2. \qquad (2)$$

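
Equation (2) is straightforward to compute. The following is a minimal sketch, assuming the mean squares and their degrees of freedom are supplied as two parallel lists; the function name and the numerical example are assumptions made here for illustration and are not part of the paper.

```python
import math


def minus_two_log_mu(mean_squares, dfs):
    """Return -2 log(mu) of equation (2).

    mean_squares: the unbiased variance estimates s_t^2
    dfs:          the corresponding degrees of freedom f_t
    """
    total_df = sum(dfs)                                  # F of equation (1)
    pooled = sum(f * s2 for f, s2 in zip(dfs, mean_squares)) / total_df
    return total_df * math.log(pooled) - sum(
        f * math.log(s2) for f, s2 in zip(dfs, mean_squares)
    )


# Two mean squares, 1 and 4, each on 10 degrees of freedom.
print(minus_two_log_mu([1.0, 4.0], [10, 10]))
```

The statistic is zero when all the mean squares are equal and grows as they diverge, which is why large values are taken as evidence against homogeneity.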

The equivalence to a special case of the generalized $L_1$ statistic (Welch, 1935,
1936) is expressed by the relation

$$F\log L_1 = 2\log\mu, \qquad (3)$$

where

$$L_1 = \prod_{t=1}^{k} (s_t^2)^{f_t/F} \Big/ \Big\{\sum_{t=1}^{k} f_t s_t^2/F\Big\}. \qquad (4)$$

For our test we require the random sampling distribution $\phi(L_1)$ of the statistic $L_1$
under the null hypothesis

$$\sigma_t^2 = \sigma^2 \qquad (t = 1, 2, \ldots, k).$$

Under these conditions it has recently been shown (Welch, 1936) that the $(q-1)$th
sampling moment of $L_1$ is given by

^ 2-1 = jy { L [) L ' r ^ dL [ 


^ IFY^-i)hw r(hF) 


(5) 


From general principles it may now be inferred that equation (5) is valid for
all complex $q$ with

Real Part of $q > 1$.

Further, by Mellin's inversion formula, we obtain from (5)

m) = iw) n FM)-^ 

(=1 


1 

2ni 


•Q+ito k r / F 

n 7 

Q—i<o i=l [^\Jl 




r 



L[-^ dq 

F 11 

_ F(^F + q-l) 


.(6) 


where $Q$ ($> 1$) is an arbitrary positive quantity.
Introducing as a new statistic

$$x = -F\log L_1 = -2\log\mu, \qquad (7)$$

and as a new variable of integration

$$\lambda = \tfrac12 + (q-1)/F, \qquad (8)$$

we obtain for the distribution function of $x$ ($\psi(x)$ say)




-1 p-{x 


I r/i+ioo £ ((FVh 


^ 27riJ^-i« <Pi Wfi 


Fm 




F{FX) 


dX, (9) 


where A is an arbitrary positive quantity. 

Using now Binet's integral representation of $\log\Gamma$ (Whittaker & Watson,
1927, p. 249), we may write equation (9) in the form


where 


F(A) = 


o\2 


/*/l+tco 

J A—i«> 

....(10) 


....(11) 
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Introducing 


i/W =b-r+ 


2 T e’’-1/T’ 


.( 12 ) 


we see that $g(\tau)$ has continuous derivatives of any order for $0 \le \tau \le \infty$. We may
therefore transform the integral (11) by integration by parts, differentiating
$g(\tau)$ and integrating the exponential functions. We obtain


m 


1 


r 

1 


I--1 

12A 



360A» 

LSl/fJ 



+ 


A^Jo lh\ f^ l~ J 


.(13) 


We now approximate to E(X), and therefore to f{x), by ignoring the last summand 
in equation (13), and write 

= i‘*) 

1 1\ I *=/l\l 

where 

It can be shown that this approximation is sufficient for all practical purposes 
provided 

Substituting (14) in (10), expanding e^^‘> and integrating the single terms 
we obtain* 


00 /I 1 ^ 

^(a:) ^ X hi) 

i-o \ 2 / 


..(16) 

..(17) 


where the are the coefficients of the expansion 
in ascending powers of t. 

From (16) it is obvious that (to the degree of accuracy considered) the distribution
of $x$ is a weighted sum of $\chi^2$ distributions with degrees of freedom ranging
between $k-1$ and $\infty$. We now denote by $P_j(X)$ the probability integral of $\chi^2$
based on $j$ degrees of freedom, i.e. we introduce

$$P_j(X) = \frac{1}{\Gamma(\tfrac12 j)\,2^{j/2}}\int_X^{\infty} x^{\frac12 j - 1} e^{-\frac12 x}\,dx. \qquad (18)$$


We further denote by $P(X)$ the probability integral of our variate $x$ defined
in (7), viz.

$$P(X) = \int_X^{\infty} \psi(x)\,dx. \qquad (19)$$

* We make use of the well-known integral representation of $1/\Gamma(z)$, viz.

$$\{\Gamma(z)\}^{-1} = \frac{1}{2\pi i}\int_{A - i\infty}^{A + i\infty} e^{p} p^{-z}\,dp.$$
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From equation (16) we obtain by integration 

P{^) = S X ( S > (20) 

1=0 \i=o / 

where the oi^ are the coefficients of the expansion of 
in ascending powers of J. 


3. Tabulation of percentage points

Equation (20) provides a means of calculating tables of the probability
integral $P(X)$ (or its 5 % and 1 % points). For the quantities $P_j(X)$ are given
by Elderton's tables of the probability integral of $\chi^2$, while the coefficients
are readily obtained from the expansion of (21).

For practical purposes tables of the 5 % and 1 % points could be prepared.
These percentage points would depend on three quantities, viz.

$$k, \qquad c_1, \qquad c_3.$$

The effect of $c_3$ is small, and it would be convenient to make $k$ and $c_1$ the respective
row and column headings of two-way tables of percentage points, and to prepare
such tables for two or three selected values of $c_3$. It is hoped to prepare such tables
shortly.


4. Comparison with U. S. Nair's expansion

It would lead us too far afield if we gave here a complete mathematical proof
of the accuracy of the approximation (20). It is, however, of interest to check
the accuracy in a few cases numerically. U. S. Nair's expansion mentioned
above will be used for this check. The most stringent test of the accuracy of
equation (20) is established by choosing the $f_t$ small and $k$ large. Numerical
results have been obtained from U. S. Nair's expansion (c) in the case $f_t = f = 2$;
$k = 10$. The result of the test is given below.


Lower percentage points of $L_1 = \mu^{2/F}$

                                  5 % point   1 % point
(a) Bartlett's approximation        0·367       0·277
(c) U. S. Nair's expansion          0·376       0·288
(d) Equation (20)                   0·378       0·291

The agreement between U. S. Nair's expansion (c) and equation (20), (d), is
satisfactory in this case where the approximation would be expected to be worst.
For comparison, Bartlett's approximation (a) is also shown.

5. The relation between the special case $f_t = f$, $t = 1, 2, \ldots, k$, and the general case

P. P. N. Nayer has considered the general case, and has provided some evidence
for believing that this case can be reduced to the special case provided that
the $f_t$ are not too small and not too dissimilar in value. He has suggested
using the mean of the $f_t$ as a substitute for the common value $f$. It is easy to see
from the approximation (20) that there is some truth in Nayer's conjecture.
However, it is not correct to use the arithmetic mean. The correct value is given by



and is approximately equal to the harmonic mean of the $f_t$. If all $f_t > 4$, the general
case of unequal $f_t$ can always be reduced to the special case $f_t = f$, no matter
how dissimilar the/. For if in equation (21) we replace C 3 by and consider 
the function 

eJc#- 

we find that the coefficients of this function when expanded in ascending powers
of $t$ will be approximations of sufficient accuracy to the coefficients in (20).
The probability integral of $X$ is therefore determined by the quantities $k$ and $c_1$,
so that the identity of the general case and the special case is obvious, provided $f$
is defined by (22).
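
The idea of this section can be illustrated numerically. The sketch below assumes that the common value $f$ is chosen so that the special case reproduces the same $c_1 = \sum 1/f_t - 1/F$ as the general case; for $k$ equal values this gives $c_1 = (k - 1/k)/f$. That matching rule and the function names are assumptions made here for illustration, not a quotation of equation (22).

```python
def c1(dfs):
    """c1 = sum(1/f_t) - 1/F, with F the total degrees of freedom."""
    return sum(1.0 / f for f in dfs) - 1.0 / sum(dfs)


def matched_common_f(dfs):
    """Common f giving the same c1 for k equal degrees of freedom (assumed rule)."""
    k = len(dfs)
    return (k - 1.0 / k) / c1(dfs)


def harmonic_mean(dfs):
    return len(dfs) / sum(1.0 / f for f in dfs)


def arithmetic_mean(dfs):
    return sum(dfs) / len(dfs)


dfs = [2, 4, 8]
print(matched_common_f(dfs), harmonic_mean(dfs), arithmetic_mean(dfs))
```

For dissimilar degrees of freedom such as 2, 4 and 8 the matched value lies close to the harmonic mean and well below the arithmetic mean, which is the qualitative point made in the text.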

6. Some remarks on Bartlett's approximation
Bartlett (1937) has given an approximation to the distribution of

$$-2\log\mu = x.$$

He suggests as an approximate test that we enter the table of $\chi^2$ for $k - 1$ degrees
of freedom with the statistic

3a:(fc— l)/ci, 

where is given by (15), 

It can be shown that this approximation is equivalent to equation (20)
provided $k$ is small, so that higher order terms in the expansion (20) may be
ignored. For large or moderate values of $k$, however, discrepancies may occur
even if all mean squares are based on moderate or large numbers of degrees of
freedom. We shall confine ourselves here to demonstrating this with the help
of a single example, viz. $f = 5$ and $k = 30$. While for values of $k$ of this order
U. S. Nair's expansion is very complicated, equation (20) yields results which are
accurate to 3 figures. Below are given the probabilities of exceeding Bartlett's
5 % and 1 % values; they are

                5 % level   1 % level
True P(X)         0·047      0·0081

Thus Bartlett's approximation has an error of 6 % and 19 % respectively.

THE SIMULTANEOUS DISTRIBUTION IN SAMPLES OF
MEAN AND STANDARD DEVIATION, AND OF MEAN
AND VARIANCE

By L. TRUKSA
The Charles University, Prague

In this study I propose to give the application of “the conception of the probability
of passage” to the solution of the rather difficult general problem mentioned
above. From this single example it is possible to deduce that the introduction of
“a conception of the probability of passage” into mathematical statistics would
at least make the solution of a range of difficult problems considerably easier.

Let us assume that the class symbol of the statistical element is a continuous
two-dimensional variable $x, y$, defined in the region $\Omega$, and that the corresponding
density of the probability of passage from the class $x_1, y_1$ into the class $x, y$,
expressed by the symbol $p_t(x_1, y_1; x, y)$, depends not only on the variables $x_1, y_1, x, y$
but also on the discontinuous variable $t$, which is the number of operations
executed on the statistical element. By operation we shall mean in our case the
selection of a statistical element from the fundamental universe.

Let the function $p_t(x_1, y_1; x, y)$ satisfy the following fundamental relationship:



( 1 ) 


A further relationship, which will be used, concerns the calculation of the
continuous two-dimensional probability distribution $F_{t+1}(x, y)$, corresponding to
the number of operations $t+1$, from the distribution $F_t(x_1, y_1)$ by means of
$p_t(x_1, y_1; x, y)$:

$$F_{t+1}(x, y) = \iint_{\Omega} F_t(x_1, y_1)\,p_t(x_1, y_1; x, y)\,dx_1\,dy_1. \qquad (2)$$

The problem of the simultaneous distribution of the mean and standard
deviation of samples in the case in which the fundamental distribution is given
quite generally by a function $f(x)$, has, so far as I know, occupied the attention
of only one man; this was A. T. Craig (1932) in his study: “The simultaneous
distribution of mean and standard deviation in small samples.” He introduces
the solution only for samples of a very small number of items, $n = 2, 3, 4$.

The use of the conception of the probability of passage enables us to demonstrate
successively the solution of this problem for samples with an increasing
number of items. The method used gives us at the same time a solution in a very
easy manner, and especially clear, if used with a graphical illustration.

I

Let us use $\bar{x}$ for the mean of a random sample of the symbols $x_1, x_2, \ldots, x_t$ taken
from the continuous one-dimensional universe of density $f(x)$, where we have

$$\bar{x} = \frac{1}{t}\sum_{i=1}^{t} x_i, \qquad (3)$$

and $s$ for the standard deviation, which, in accordance with the definition of the
standard deviation, is given by

$$s^2 = \frac{1}{t}\sum_{i=1}^{t} (x_i - \bar{x})^2. \qquad (3\cdot1)$$

By an extension of this sample of size $t$ to the size $t+2$, we get a sample, the mean
of which is

$$\bar{X} = \frac{1}{t+2}\sum_{i=1}^{t+2} x_i, \qquad (3\cdot2)$$

and the standard deviation $S$ is given by

$$S^2 = \frac{1}{t+2}\sum_{i=1}^{t+2} (x_i - \bar{X})^2. \qquad (3\cdot3)$$

The elementary probability of passage $p_t(\bar{x}, s; \bar{X}, S)$ from the sample with $\bar{x}$
as mean and standard deviation $s$ to the sample of mean $\bar{X}$ and standard deviation
$S$ equals the probability of the appearance of the values $x_{t+1}$ and $x_{t+2}$:

$$f(x_{t+1})\,f(x_{t+2})\,dx_{t+1}\,dx_{t+2} = p_t(\bar{x}, s; \bar{X}, S)\,d\bar{X}\,dS. \qquad (4)$$

In this expression it is necessary to substitute for the variables $x_{t+1}$, $x_{t+2}$ in
terms of the variables $\bar{X}$, $S$; the values $\bar{x}$, $s$ in substituting being taken as constants.

From the expressions (3) and (3·2) we obtain, first of all, the following relationship:

$$x_{t+1} + x_{t+2} = \bar{X}(t+2) - \bar{x}t. \qquad (5)$$

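A further relationship which we obtain from equations (3·1) and (3·3) is

$$(t+2)S^2 = \sum_{i=1}^{t} x_i^2 - 2t\bar{X}\bar{x} + (t+2)\bar{X}^2 + x_{t+1}^2 + x_{t+2}^2 - 2(x_{t+1} + x_{t+2})\bar{X}.$$

If we now use the relationship

$$ts^2 = \sum_{i=1}^{t} x_i^2 - t\bar{x}^2,$$

we obtain another equation, which is necessary for the determination of the values
$x_{t+1}$ and $x_{t+2}$, i.e.

$$x_{t+1}^2 + x_{t+2}^2 = (t+2)(S^2 + \bar{X}^2) - t(s^2 + \bar{x}^2). \qquad (5\cdot1)$$

From equations (5) and (5·1) it then further follows that

$$x_{t+1} = \frac{t+2}{2}\bar{X} - \frac{t}{2}\bar{x} \pm \tfrac12\sqrt{2[(t+2)S^2 - ts^2] - t(t+2)(\bar{X} - \bar{x})^2} = \frac{t+2}{2}\bar{X} - \frac{t}{2}\bar{x} \pm \tfrac12\alpha,$$

$$x_{t+2} = \frac{t+2}{2}\bar{X} - \frac{t}{2}\bar{x} \mp \tfrac12\sqrt{2[(t+2)S^2 - ts^2] - t(t+2)(\bar{X} - \bar{x})^2} = \frac{t+2}{2}\bar{X} - \frac{t}{2}\bar{x} \mp \tfrac12\alpha, \qquad (5\cdot2)$$

where, for the sake of brevity, we introduce the symbol

$$\alpha = \sqrt{2[(t+2)S^2 - ts^2] - t(t+2)(\bar{X} - \bar{x})^2}.$$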
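
Relations (5), (5·1) and (5·2) can be checked numerically: given the mean and standard deviation before and after two values are added, the two added values are recovered. The sketch below is illustrative; the function names and the sample values are assumptions made here, and the standard deviation uses the divisor $t$ as in (3·1).

```python
import math


def mean_and_sd(values):
    """Mean and standard deviation with divisor t, as in (3) and (3.1)."""
    t = len(values)
    mean = sum(values) / t
    return mean, math.sqrt(sum((v - mean) ** 2 for v in values) / t)


def added_pair(t, xbar, s, X, S):
    """Recover the two added values from equation (5.2), smaller value first."""
    alpha_sq = 2 * ((t + 2) * S**2 - t * s**2) - t * (t + 2) * (X - xbar) ** 2
    if alpha_sq < -1e-9:
        raise ValueError("no real pair of values gives this passage")
    alpha = math.sqrt(max(alpha_sq, 0.0))   # clamp tiny rounding error
    centre = ((t + 2) * X - t * xbar) / 2
    return centre - alpha / 2, centre + alpha / 2


sample = [1.0, 2.0, 3.0]
xbar, s = mean_and_sd(sample)
X, S = mean_and_sd(sample + [4.0, 6.0])
print(added_pair(len(sample), xbar, s, X, S))
```

The `ValueError` branch corresponds to the reality condition on $x_{t+1}$, $x_{t+2}$: a negative quantity under the square root means that no pair of real values leads from $(\bar{x}, s)$ to $(\bar{X}, S)$.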

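For effecting the substitution in expression (4) it is also necessary to know
the corresponding determinant of the substitution

$$\left|\frac{D(x_{t+1}, x_{t+2})}{D(\bar{X}, S)}\right| = \frac{(t+2)^2 S}{\sqrt{2[(t+2)S^2 - ts^2] - t(t+2)(\bar{X} - \bar{x})^2}}.$$

Referring to the two different values for each of the symbols $x_{t+1}$ and $x_{t+2}$,
we then get the expression for the density of probability of passage $p_t(\bar{x}, s; \bar{X}, S)$
in the form

$$p_t(\bar{x}, s; \bar{X}, S) = \frac{2(t+2)^2 S}{\alpha}\, f\!\left[\frac{t+2}{2}\bar{X} - \frac{t}{2}\bar{x} + \tfrac12\alpha\right] f\!\left[\frac{t+2}{2}\bar{X} - \frac{t}{2}\bar{x} - \tfrac12\alpha\right]. \qquad (6)$$

Let $F_t(\bar{x}, s)$ be the density of the simultaneous distribution of mean $\bar{x}$ and
standard deviation $s$ in the random sample of size $t$.

The density of the distribution of mean $\bar{X}$ and standard deviation $S$ in the
random sample of size $t+2$ is obtained by application of the fundamental relationship
(2), and for its value we get the following expression:

$$F_{t+2}(\bar{X}, S) = \iint F_t(\bar{x}, s)\,p_t(\bar{x}, s; \bar{X}, S)\,d\bar{x}\,ds. \qquad (7)$$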

For the complete solution of this recurrence relationship, besides the limits of
integration (which we shall consider later on), we need to know the initial values
of the function $F_t(\bar{X}, S)$, i.e.

$$F_1(\bar{X}, S) = f(\bar{X}),$$

$$F_2(\bar{X}, S) = p_0(\bar{x}, s; \bar{X}, S) = 4f(\bar{X} + S)\,f(\bar{X} - S).$$

Let us supplement these values with the following function $F_3(\bar{X}, S)$ for which
only one integration is necessary:

$$F_3(\bar{X}, S) = 18S\int f(\bar{x})\, f\!\left[\tfrac32\bar{X} - \tfrac12\bar{x} + \tfrac12\sqrt{6S^2 - 3(\bar{X} - \bar{x})^2}\right] f\!\left[\tfrac32\bar{X} - \tfrac12\bar{x} - \tfrac12\sqrt{6S^2 - 3(\bar{X} - \bar{x})^2}\right] \frac{d\bar{x}}{\sqrt{6S^2 - 3(\bar{X} - \bar{x})^2}};$$

the limits of integration will be deduced later on in this study.

In order to examine the limits of integration,

(1) Let us first of all assume that the density of the fundamental universe is
expressed by the function $f(x)$, defined between the limits $\pm\infty$.

In this case we set only one condition on the values $x_{t+1}$, $x_{t+2}$, that is that they
must be real. The condition corresponding to this is expressed by the inequality

$$2[(t+2)S^2 - ts^2] - t(t+2)(\bar{X} - \bar{x})^2 \ge 0.$$

If we consider $\bar{X}$, $S$ as constants and $\bar{x}$, $s$ as variables and we use rectangular
co-ordinates with axes $\bar{x}$, $s$, these variables are limited to the region given by that
half of the ellipse $E_0$:

$$\frac{t+2}{2}(\bar{X} - \bar{x})^2 + s^2 = \frac{t+2}{t}S^2,$$

which lies above the axis $\bar{x}$. According to the choice of the values $\bar{X}$, $S$, the variable
$\bar{x}$ varies between the limits $\pm\infty$; the variable $s$ then lies in the range $0, \infty$.

The centre of the ellipse $E_0$ lies on the axis of $\bar{x}$ at a distance $\bar{X}$ from the origin
of the co-ordinates; the semi-axes have the lengths

$$A_{\bar{x}} = S\sqrt{\frac{2}{t}}, \qquad A_s = S\sqrt{\frac{t+2}{t}},$$

the semi-major axis $A_s$ being always larger than $A_{\bar{x}}$, their ratio

$$\frac{A_s}{A_{\bar{x}}} = \sqrt{\frac{t+2}{2}}$$

depending only on the size of the sample.

The integration of expression (7) with respect to $\bar{x}$ must therefore be carried
out between the limits

$$\bar{X} \pm \sqrt{\frac{2S^2}{t} - \frac{2s^2}{t+2}},$$

and then with respect to $s$ between the limits

$$0, \qquad S\sqrt{\frac{t+2}{t}}.$$

In Diagram 1 the surface of integration is shown for the particular values $\bar{X} = 1$,
$S = 1{\cdot}6$, $t = 4$.

If calculation of the value of $F_3(\bar{X}, S)$ is involved, i.e. if $t = 1$, the ellipse $E_0$
reduces to that part of the $\bar{x}$ axis with $\bar{X} \pm S\sqrt{2}$ as end-points. The corresponding
integration need only be carried out with respect to $\bar{x}$, and that between the
limits $\bar{X} \pm S\sqrt{2}$.

Formula (7) can be used, not only for the successive calculation of the simultaneous
distribution of the mean and standard deviation for an increasing size
of random samples, but also for the verification of the given distribution values
introduced for the arbitrary $t$.

Thus, for example, it is possible to check the correctness of the expression


F,{X,8)^-4i~.erW2^-^ 


i-i 


G^{27T) 




•e 20 * 


which corresponds to the normal universe 

M = ■ 


'<¥) 


c^2tt 

By application of formula (7) we obtain


e Z(”. 




(-1 

t\~r(t+2V 


n 


8 


((+2)(XH-S») 
e 2 c* 


c'+TI 


(^) 


(-1 

rr 


s^~^ds 


8 


dx 




8^ 52 


W < + 2 




c<+2ri 




((+2)(A''+S') 

e 2 o' 


d~^ds 

■ 1 

— flTn — 

-^x + X 1 

1^ CVi V oxix 

V2 

1/8^ fi2 \ 

J 0 


l\t 1 + 2), 


<+i 




-11^“ /H2\T- 8^ 


Gf{27T) 


\2cV Jt+r 


' e~ 2e“ 
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As can be seen, the function JJ(Z, 8) satisfies the fundamental recurrence 
relationship (7). 

(2) Let the fundamental universe be defined by a continuous function defined in 
the range 0, oo. 

Under this assumption the values must satisfy, apart from the 

condition of their being real, another condition expressed by the inequality 

i + 2 — t _ 

— V{2(i -f- 2) 2fa2- + 2) ( J - xf) > 0; 

the mean of the sample x cannot then assume negative values, which gives 

The corresponding values of the variables x, s, satisfying the first inequality 
lie outside the ellipse 

A 2 2 2 

The centre of the ellipse is situated on the x axis at a distance X{t + 2)/(H 1) 
from the origin, the semi -axes have the lengths 


a:,= 


7 + 2 


8^- 


Hlj’ 


Ai: = 


t + 2 
t(t + 1) 


\ i+i 


From the quotient -^ = ,J(i! + 1) it follows that the semi-axis A' is longer than 
Aj.. The ellipse intersects the x axis at the points 


i + 1 


t + 2 

i(t+l) 


8^- 


t+ 1 


The ellipse Sj lies inside the ellipse £!{, and touches it at the point Xj, s^, where 


X^ — X ^ , 5^ 


t 


8^- 


?) 


8>X l~. 


as long as 

From the expression for the lengths of the semi-axes it follows that E^ is real 
only as long as the condition 


8> 


X 


^l{t+i) 

is satisfied. 

Considering the condition that the mean x cannot attain negative values, we 
obtain the inequality 8^X^J(t + l). 

The system of the ellipses Ei is therefore real for those points with the para- 
meters X, 8 as co-ordinates, which lie between the straight lines 


and that for Z 0, > 0. 
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When carrying out the integration indicated in formula (7 ), it is also necessary 
to distinguish between the cases when the ellipse intersects the negative part 
of the X axis and when it does not, i,e. if 

S4lJ^ or 8iXj{. 

In Diagram 2, the corresponding regions of the values X, S are shown fort 



The surface of integration is that part of the plane x, s in the first quadrant 
contained between the ellipses Eq and E^. 

The process of integration is then as follows: 

(a) Let S satisfy the inequality 


X 

1 ) 


In this case only the elhpse E^ is real, It is therefore necessary to carry out the 

— I{2S^ 2s^ ' 

integration with respect to x in between the limits X ± 

It + 2 

respect to s between the limits 0, S 


t t + 2 


and with 


In order to illustrate the theory, formula (7) will be applied to find the 
analytical expression of the correlation surface S) for samples from the 
fundamental universe ^ o < a; < oo 


over the region in the plane X, 8 bounded by the inequality mentioned above, 
Tor i = 2 and 3 the results are given in A. T. Craig’s paper, namely 


F,{X, 8) = 4e-2^; F^{X, df) = 6 ^3 
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By application of our formula (7) we obtain 

F^il, 8) = Un8h-^ ; F^{X, 8) = 50 ^6 
Regarding these expressions let us assume the general solution to be 

A recurrence relationship for the coefficients may be easily found from the 
equation 

(+2 

Fi^,{X, 8) = 2S(H 3)® ^ 




^i+i ■” 


t+i 

(< + 2)^ 


It is then not difficult to yerify that for < ^ 2 


Ht 


t+i t 
27T^ {2 


t+1 t 


The function 


Ott 


(¥) 


8^~h-‘^, t>2, 


represents a part of the whole distribution given by the integral 




foo /’V«+i) _ _ 2v 2 Pd) 

F,lX,S)dXiS.- , »2, 


that is, if f = 2 3 

100% c.60% 

of the whole distribution. 

(6) If jS satisfies the inequality 

1 


4 

c. 30 % 


6 

c. 13 % 




V(<+i) 

the relative position of the ellipses and is shown in Diagrams 3a, 6. The 
integral in formula (7) is equal to the sura of the integrals between the following 
limits: 

(a) According to x: 


X- 


28^ 2s2 


t i + 2 


2s2\ t + 2 / 

t + 2)’ i + 1 V' 


i + 2 


8^-s^ 


J__^\ 

i + 2 i+ 1/ 
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According to 



(/?) According to x: 



According to s: 




( 7 ) According to x: 

i + 1 /J 


S^- 


^ + 2 




According to s: 



(c) }?inally if X 


- < < A ^l{t+ 1 ), part of the ellipse E„ falls to the left of 


the s axis, always being to the right of the s axis, the surface of integration of 
the values *, s is made up of two parts, as can be seen in Diagram 4. 

For the purpose of integration it is necessary to divide the integration into 
four parts :a,^,y,S. The integral of the expression given in (7 ) is equal to the sum 
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of the integrals with the following limits for the value x, and for 5 , x to be taken 
first of all: 


(a) The limits of a:: 

0; X 


The limits of s\ 

(/?) The limits of 

( 28 ^ 28 ^ 


i + 2 
4+1 

0 ; 


4 + 2 
4(4+1) 

4 + 2 


^ 2 - 


sH 

Z 2 \ 

4 + 2 

4+1/ 


AJKJ I ^ ^ 


" 4 4+2 

The limits of «: 

/4 + 2 


4 + 2 - 
t ^ 


^4 + 2 / 4 + 2 


'4+1 


8^- 


t 

( 7 ) The limits of x: 

X- 

The limits of s; 


4 + 2 
8^-~X^ 

u 


4(4 + 1) 


sH 

x^ \ 

( 4“ 2 

4+1/ 


(4-2 

T ' 


iS> 


28^ 

t 


2 s2 \ 

4 + 2] 


4 + 2 

F 


8^- 


X^ 
'4 + 1 


8 


'4 + 2 


(S) The limits of £c: 

4 + 2 


1 ~ + 


'4(4 + 1) 
The limits of s: 


8^- 


0 ; 


sH X 2 
4 + 2 4+ 1 /' 
’4+_2 
4 


^ + 


28^ 2s^ 

4 4 + 2 




If the calculation of function F^iX, 8) is especially required, that means, if 
4 ~ 1, that there is a substantial simplification in the integration. First of all it is 
necessary to carry out the integration only according to x\ besides that it is 
sufficient to differentiate between two cases only, according to whether the 

value 8 satisfies X X - 

or j^^8^Xf2, 

^ —It 

since obviously the case of- - 


< ~ drops out of consideration. 


VHI) 

If the first of the given inequalities is valid, it is necessary to carry out the 
integration with respect to x between the limits: 

X-8f2-, X + 8f2; in the case of the validity of the second inequality, the 
corresponding hmits of integration are: 

0; fX-iV{6^^-3Z*), 
fl + IV(6>S2-3Z2); X + 8^2. 

These results agree with the results quoted in the paper by A. T. Craig
already cited.

(3) Let the fundamental universe be defined in the range $0, a$.

The values % 2 ) case are determined by the inequality 

0 ^ ®i+l) ^ ®> 

the mean of the sample satisfies the inequality 

0 < * < a. 

From the upper limit of the values it follows that 

t + 2 ~ t . - 

~X--x + y{2{t + 2)S^^ 2ts^ ~t(i + 2)(X-x)^}^ a. 
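
The quantity under the square root can be traced in a few lines. The following is a sketch of the algebra only, on the assumption (not restated at this point in the text) that $s^2$ and $S^2$ are the variances of the samples of $t$ and $t+2$ items with divisors $t$ and $t+2$; the symbols $p$ and $q$ are introduced here purely for the derivation.

```latex
\begin{aligned}
p &= x_{t+1} + x_{t+2} = (t+2)X - t\,x,\\
q &= x_{t+1}^2 + x_{t+2}^2 = (t+2)(S^2 + X^2) - t\,(s^2 + x^2),\\
(x_{t+1} - x_{t+2})^2 &= 2q - p^2
   = 2(t+2)S^2 - 2ts^2 - t(t+2)(X - x)^2,\\
x_{t+1},\, x_{t+2} &= \tfrac{1}{2}\,p \pm \tfrac{1}{2}\sqrt{2q - p^2}.
\end{aligned}
```

The larger of the two roots is the expression that must not exceed $a$.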

The corresponding values of the variables $x$, $s$, which satisfy this inequality,
lie outside the ellipse $E_2$:


f + 2i:) + ^ ^ ,Sf2 _ (i±M±^ ^2 j „ a2_ 

a 2i t 

Its centre lies on the x axis and is at a distance from the origin equal to 

"^< + 1 t + 1’ 

this distance is always smaller than the distance of the centre of the ellipse 
from the origin. 

The lengths of the semi-axes are 


AU 


t + 2 I 


t{t + l)^| 


Si- 


t+i 


a: 


\l{t+iy 


The ellipse E^ lies inside the ellipse E^, and touches it at the point where 


Xi, - X 


t-\~2 2a 
~t 


2a lt + 2 

~T’ 


8'^--{a-Xf 

t 


as long as 


S^{a-X) 


If the ellipse E^ is to be real, then we must have 

a-X 




s+^y 


From the upper limit a of the mean x of the sample it follows that 

t'j~2 


=i-|-2 a 
-A . — — — -t- 


t{t + l) 




^ -j- 1 ^ + 1 
This inequality leads to the relation 

,S<{a-Z)V(Hl). 

Finally, it is necessary to consider the condition that the ellipses $E_1$ and $E_2$
must not intersect; in the limiting case they touch at the point
, t 2 a 


x = X- 


t t' 


^ + 2 qg oy ^ + 2 = „g f + l 

—SH2X-j^(a~X)~a^-^ 


0 . 
From this condition there follows the limitation of the permissible values of $X$, $S$
by the hyperbola ^aX t+l _ 

For carrying out the integration in formula (7) it is necessary to know the
condition for the ellipse Ef^ to intersect the x axis at a distance A from the origin, i.e. 


Diagram 5 



The whole process of integration is quite clearly shown in the graphical
representation of the individual segments of the permissible values $X$, $S$ in
Diagrams 5-7, for $t = 1$, 2 and 3. It is convenient to split up the surface of the values
$X$, $S$, for which both the ellipses $E_1$ and $E_2$ exist, into two parts by the straight line
$X = \tfrac{1}{2}a$.

If $X < \tfrac{1}{2}a$,

the semi-axis of the ellipse $E_1$ is longer than that of the ellipse $E_2$.

If $X > \tfrac{1}{2}a$,

the ratio of the lengths of the semi-axes is the reciprocal.

The description of the limits of integration for s is lengthy in tabular form 
and is better given by means of diagrams. 

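In the special case, $t = 1$, the integration is carried out only with respect to $x$,
for the values $X$, $S$ (see Diagram 5) in the segment:

(α) Between the limits: $X - S\sqrt{2}$; $X + S\sqrt{2}$.

(β) Between the limits:

$0$; $\tfrac{3}{2}X - \tfrac{1}{2}\sqrt{6S^2 - 3X^2}$ and $\tfrac{3}{2}X + \tfrac{1}{2}\sqrt{6S^2 - 3X^2}$; $X + S\sqrt{2}$.

(γ) Between the limits:

$X - S\sqrt{2}$; $\tfrac{1}{2}(3X - a) - \tfrac{1}{2}\sqrt{6S^2 - 3(a - X)^2}$ and $\tfrac{1}{2}(3X - a) + \tfrac{1}{2}\sqrt{6S^2 - 3(a - X)^2}$; $a$.

(δ) Between the limits:

$0$; $\tfrac{1}{2}(3X - a) - \tfrac{1}{2}\sqrt{6S^2 - 3(a - X)^2}$ and $\tfrac{1}{2}(3X - a) + \tfrac{1}{2}\sqrt{6S^2 - 3(a - X)^2}$,

$\tfrac{3}{2}X - \tfrac{1}{2}\sqrt{6S^2 - 3X^2}$ and $\tfrac{3}{2}X + \tfrac{1}{2}\sqrt{6S^2 - 3X^2}$; $a$.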



As a simple illustration let us find the correlation surface $F_t(X, S)$ of samples
of $t$ items drawn from the distribution

$$f(x) = \frac{1}{a}; \qquad 0 < x < a$$

over the region bounded by the straight lines 

« = 0. 


#-!)■ 

Using the results given by A. T. Craig, 

«XS) = i; = 

Uf U/ 

we find, by our formula (7), 


F,{X,8) = ~^-; 

Cv 


50 1/5 71^8^ 
1^’ 


Biometrika xxxi 


18 
and then by similar reasoning as in example on page 262 we obtain


F,{X>S) 


i~i t 


in ^ t^ 





This function is independent of the variable X. 

As to statistical theory, the problem of the correlation surface $F_t(X, S)$ is
solved by our formula (7) for any fundamental population $f(x)$, but as to its
application to a special distribution $f(x)$, I have not overcome all the difficulties of
integration. Nevertheless, I feel that, the approach being a new one, this study may be
of interest to statisticians, and I hope that some mathematician will see
how to solve the problems that I have left uncompleted.


II 

Let us take the density of the simultaneous distribution of a mean $x$ and
variance $s^2 = u$, when the size of the sample is $t$, as $G_t(x, u)$. By an application of
the same process as in Part I we get the fundamental recurrence relationship
for the successive calculation of the values $G_t(x, u)$:




\dxdu 


a 


( 8 ) 
The corresponding initial values are expressed by the following relationship:

0i(x,y)=4/(x+vw-vt')j 

For the calculation of the next value $G_3(X, U)$ we need to carry out only one
simple integration:


G,{X, U) = 9 


/(S)/[P _ p - 2 m - C(Z - x)^}] 

x/[|Z - lx-y{QU- 2u - 6(Z - »)2}] 


dx 


V{6l7-2u-6(Z -*)“}■ 


The limits of integration for the different ranges of the fundamental universe
can be deduced by the method given in Part I. At the same time the three ellipses
$E_0$, $E_1$, $E_2$ are replaced by the three parabolae $P_0$, $P_1$, $P_2$.


t-\- 2 
“ 1 “ 


(S-Z)Htt 


t + 2 


U, 


... ,o^„v i,, H2 {t+l)(t + 2)^ 

— - — X -t{t + 2)xX + -u- -^U 2 


- « + 2 Z) + iw = a(i + 2) Z - a® 


In conclusion I must express my thanks to Professor E. S. Pearson for advice 
and several useful suggestions. 
CERTAIN PROJECTIVE DEPTH AND BREADTH
MEASUREMENTS OF THE FACIAL 
SKELETON IN MAN 

By ALETTE SCHREINER, Oslo 

1. Definitions of the measurements 

In their study of the “flatness” of the facial skeleton in man T. L. Woo & G. M.
Morant (1934) lay down no method for direct measurement of the transverse
flattening of the middle part of the facial skeleton, i.e. of the part made up by the
malar bones and corpora of the maxillae. In studying a number of crania with
differently shaped facial skeletons, it occurred to me that the best expression of the
degree of projection of the middle part of the facial skeleton might be obtained by
expressing the projective distances of the zygomaxillary and zygotemporal points
from the most posterior points on the margins of the pyriform aperture as
percentages of the lengths of the chords between the corresponding bilateral points.

It is true that no precise points of general validity for this purpose can be 
indicated on the “nasolateral” margins. It is also the case that we often find 
asymmetry here, though hardly more than in other parts of the facial skeleton 
on which routine measurements are taken. However, the advantages offered by 
the most posterior points on the margins for subtense measurements appear to 
me to outweigh the disadvantages. It is a factor of some importance that the two 
subtense planes in question are almost coincident and approximately horizontal. 

After having taken some test measurements with a pair of ordinary co-ordinate
callipers, I came to the conclusion that this instrument was unsuitable for my
purpose. I therefore decided to undertake a preliminary research with the object
of testing the value of the method, disregarding the disturbing factor of asymmetry.
For this purpose I designed, and ordered from P. Hermann of Zurich, a special
pair of callipers with two parallel arms, both of which might be moved in directions
at right angles to the bar to give readings of the distances of the tips from the bar.
According to my design the tips were to be slightly blunt, and the bar was to be
only 2 mm. thick with blunt edges on its working-face. Owing to the trouble and
expense involved in the construction of an instrument of this kind, P. Hermann
sent for my approval one of a set of twelve pairs of callipers which he had made some
years ago to the order of R. Pöch, some of which were still in his possession (Fig. 1).
In most respects the design is similar to my own, but the bar is 3 mm. thick and the
edges of the working-face are sharp. Furthermore, the scales of the arms are fixed
in such a manner as to necessitate correction of the readings of the subtenses,
whereby 4.3 mm. (controlled by the Weights and Measures Standards Office,
Oslo) must be added to the readings. In spite of these drawbacks I decided to keep
the instrument and, though it is not ideal for the purpose, it can be used effectively.
I do not know for what kind of measuring it was originally designed.

For measuring a chord and its subtense I first set both arms of the instrument
at equal lengths, longer than the subtense, and place the tips in contact with the
extremities of the transverse chord. I then fasten the screw, draw back the arms,
place the working-face of the bar lightly but firmly against the nasolateral margins,
and move the arms until the tips again meet the bilateral points. On removing the
instrument the distance between the arms and the subtenses are recorded, the
latter after addition of 4.3 mm. If the readings on the two arms are not equal, as



Fig. 1. Callipers used to measure the breadths and subtenses. 
(The figures on the scale are centimetres.) 


is most frequently the case, the average of the two values is recorded. For my 
method of measuring, however, I do not regard as suitable skulls with conspicuous 
asymmetry of the facial skeleton. 

With my instrument I also took the measurements which give the “frontal
index of facial flatness” of Woo & Morant in such a manner that both arms were
of the same, or practically the same, length. This frequently necessitated
re-measuring. I do not believe that the length of the subtense obtained by my method
differs from that arrived at by using a pair of co-ordinate callipers of the usual form.

My measurements are as follows: 

(i) The chord IOW, inner biorbital breadth, fmo-fmo of Martin (1928).

(ii) Sub. IOW, the subtense of the nasion from the chord IOW.

(iii) The chord GB, bimaxillary breadth, zm-zm of Martin.

(iv) Sub. GB, the subtense of the nasolateralia (Nl-Nl), i.e. the line joining the
most posterior points on the margins of the pyriform aperture, from the chord GB.

(v) The chord ZB, bizygotemporal breadth, between the two zygotemporalia
inferioria (ZT of Woo, 1937).

(vi) Sub. ZB, the subtense of Nl-Nl from the chord ZB.

(vii) FB, the bizygomatic breadth, zy-zy of Martin.

Before passing on to the indices, I must comment on the fact that the location 
of the zygomaxillary point (zm) gave me a certain amount of trouble. I did not 
always adhere to Martin’s definition of it, which specifies absolutely the lowest 
point on the zygomatico-maxillary suture. Woo & Morant adopt this definition, 
adding that “if the inferior extremity of the suture is a short length lying parallel
to the horizontal plane, the anterior point on it is the one accepted”. What is the
position, however, if the length is not quite short, and if it does not lie quite parallel
to the horizontal plane, as when the margin is rough and irregular? The fact is
that we are dealing here with a region of the facial skeleton which reveals a par- 
ticularly high degree of variability, due primarily to differences in the mode of 
origin of the anterior part of the masseter muscle. In many cases the origin does 
not reach the maxilla, and where this is so location of the point presents no diffi- 
culty; but in other cases, and probably for some races in the majority of cases and 
particularly in males, the origin continues for different lengths on the lower border 
of the maxilla, and it often shows fairly strong impressions there. This last condition
was frequently found in the case of the Norwegian skulls which I have
examined. The same is true for the Eskimo and Australian specimens, but it was
found much less frequently in the case of the skulls of Lapps and those of some other 
races. There appear to be racial differences in this respect. 

In some cases the lowest point on the suture lies rather far back on a broad 
and rugged surface. Such a point would appear to be quite useless for the purpose 
of measuring the projection of the facial skeleton, and in fact it may not always 
be possible to reach it with the tip of one of the arms of the callipers when the bar 
is placed on the nasolateral margins. In view of this I was compelled to draw up 
a new definition of the zygomaxillary point, viz. the lowest point of the zygo- 
matico-maxillary suture which still lies on the anterior surface of the bones. The 
adoption of this definition may occasionally alter the length of the chord GB by
a small amount, but this is of little significance in comparison with the change 
which it will sometimes make in the length of the subtense. 

The definition of the zygotemporal point also seems rather uncertain. Woo 
(1937) defines ZT as “the lowest point on the zygomatic suture which is still on
the lateral surface of the arch”. In some cases, however, it is very difficult to say
where the lateral surface of the arch ends. The section of the arch shows consider- 
able variation in form. Rectangular sections which make possible an absolutely 
precise location of the point are somewhat scarce. More frequently the zygoma 
has a lateral and a latero-inferior surface which are more or less indistinctly separated
from each other. In such cases I have located the point to the best of my
ability on the borderline between the two. In most cases I have found the chord
to be 2-3 mm. shorter than the greatest possible breadth between the sutures on 
the two sides. Although the question is of much less importance here than it is 
for the zygomaxillary point, a more precise definition is nevertheless desirable. 

Should the method which I suggest for measuring facial projections in relation 
to the nasolateral margins be generally adopted by craniologists, it will be necessary 
for them to find indisputable definitions to give the exact location of the extremities 
of the two chords in question. 



Fig. 2. A horizontal section of the facial skeleton illustrating measurements taken.


The indices used can be divided into two classes, the first (Nos. viii-xi) involving
subtenses and the second (Nos. xii-xvi) being ratios of pairs of the transverse
breadths. They are:

(viii) SFi, 100 Sub. IOW/IOW = frontal index of facial flatness,

(ix) SMi, 100 Sub. GB/GB = maxillary index of facial flatness,

(x) SZi, 100 Sub. ZB/ZB = zygotemporal subtense index,

(xi) SSi, 100 Sub. GB/Sub. ZB,

(xii) GOi, 100 GB/IOW = maxillo-orbital breadth index,

(xiii) ZOi, 100 ZB/IOW = zygomatico-orbital breadth index,

(xiv) GZi, 100 GB/ZB = maxillo-zygotemporal breadth index,

(xv) GFi, 100 GB/FB = maxillo-facial breadth index,

(xvi) ZFi, 100 ZB/FB = zygomatico-facial breadth index.


Finally, I have calculated approximately the angle between the vertical plane 
through the zygomaxillary and zygotemporal points on either side and the median 
sagittal plane (see Fig. 2). Assuming that the two planes zm-Nl-Nl-zm and
ZT-Nl-Nl-ZT are horizontal and coplanar, the fraction (ZB − GB)/2(Sub. ZB − Sub.
GB) is the tangent of the “zygomatic angle” (Z∠).
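
The definitions of the indices and of the zygomatic angle reduce to a few lines of arithmetic. The sketch below is illustrative only: the function names and the measurements fed to them are hypothetical, and the dictionary keys simply follow the abbreviations of the indices (viii)-(xvi).

```python
import math

def facial_indices(iow, sub_iow, gb, sub_gb, zb, sub_zb, fb):
    """Return the nine indices (viii)-(xvi) from the seven measurements (mm)."""
    return {
        "SFi": 100 * sub_iow / iow,   # frontal index of facial flatness
        "SMi": 100 * sub_gb / gb,     # maxillary index of facial flatness
        "SZi": 100 * sub_zb / zb,     # zygotemporal subtense index
        "SSi": 100 * sub_gb / sub_zb,
        "GOi": 100 * gb / iow,        # maxillo-orbital breadth index
        "ZOi": 100 * zb / iow,        # zygomatico-orbital breadth index
        "GZi": 100 * gb / zb,         # maxillo-zygotemporal breadth index
        "GFi": 100 * gb / fb,         # maxillo-facial breadth index
        "ZFi": 100 * zb / fb,         # zygomatico-facial breadth index
    }

def zygomatic_angle(gb, sub_gb, zb, sub_zb):
    """Angle in degrees whose tangent is (ZB - GB) / 2(Sub. ZB - Sub. GB)."""
    return math.degrees(math.atan((zb - gb) / (2 * (sub_zb - sub_gb))))

# Hypothetical measurements in mm, chosen only to exercise the formulas.
skull = dict(iow=100.0, sub_iow=18.0, gb=96.0, sub_gb=12.0,
             zb=120.0, sub_zb=36.0, fb=130.0)
indices = facial_indices(**skull)
angle = zygomatic_angle(skull["gb"], skull["sub_gb"], skull["zb"], skull["sub_zb"])
```

The factor 2 in the denominator arises because the chords ZB and GB are full bilateral breadths, whereas the angle concerns one side only, so that the lateral offset between the two points on that side is half the difference of the chords.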
A metrical description of the curvature of the inferior margin of the malar
bone would be of interest, but my attempts to measure it have been unsuccessful.
A survey of such a feature would appear to be possible only by using complicated
projective methods.


2. The cranial series measured

My main material consists of 100 male and 100 female Norwegian skulls from 
medieval churchyards in Oslo, and 100 male and 100 female skulls of Lapps ob- 
tained from various cemeteries in the county of Finmark, the majority of these 
being of fairly recent date. Most of the Lapp skulls form part of the material 
already dealt with by K. E. Schreiner (1931-5), but some of them have been 
acquired at a later date by the Anthropological Institute, Oslo. All of these are 
brachycephalic specimens. The Oslo skulls form part of the material which has
been described by the same author (1939). I have looked through his records and
have omitted the small number of skulls with a cephalic index greater than 79.9,
or an upper facial index less than 60.0.

The series of foreign skulls in the possession of our Institute are all too small, 
and for the most part of too miscellaneous a nature, to provide results of any 
importance. I have, nevertheless, examined the best of them for the purpose of
obtaining comparative data. I measured twenty-five male and twenty-five female 
Eskimo skulls. The majority of these came from Greenland, but two of the male 
and six of the female specimens are from the opposite coast of Labrador. I have 
found no distinct differences between the two local groups. The other series which 
I measured are:

Ten male and seven female Indian skulls from different parts of America, all 
of which have common features while none shows artificial deformation; 

Eleven male and eleven female Negro skulls from different parts of Africa, all 
being clearly dolichocephalic;

Thirteen male and nine female native Australian skulls from different parts
of the continent; 

Twelve male and nine female Maori skulls from a single cave on the North 
Island of New Zealand. 

The Australian and Maori skulls form part of the material dealt with by 
K. Wagner in his great work (1937). I have only included skulls sufficiently com- 
plete to provide all the measurements. 

3. Sexual comparisons 

Table I gives all the means which I have calculated. We will first examine the 
question of differences between the sexes. In Tables IIa and b sex ratios expressing 
the female means as percentages of the male are given for all the characters and 
groups, and in the lower sections of the same tables the differences between the 
TABLE IIb

Sex ratios (female mean/male mean) and mean differences for the indices
means for the two sexes, together with the standard errors of these quantities, are
provided for the Norwegian, Lapp and Eskimo series. As will be seen from Table
IIa, all the male means of absolute measurements are greater than the corresponding
female means, except in the case of Sub. GB for the Negro series. Some
of the differences for the three longest series are not statistically significant. This
is so for Sub. IOW in the case of the Norwegian and Eskimo series, although the
same difference for Lapps is markedly significant. This is an unexpected conclusion
and no great reliance can be placed on it, particularly in view of the fact that
K. E. Schreiner has found that in general sexual differences are smaller in Lapps
than in Norwegians. The peculiarity noted is undoubtedly due to the mixed
composition of the Lapp material. Our Lapps do not form a homogeneous population,
being mixed at different places in different degrees with Norwegians and Finns
(Quains). The skulls were collected in various localities spread over a wide area.
Several of the component local series are small and the sexes are unequally 
represented in them. I have calculated the sex ratios of this character separately 
for all the local groups and have found fairly large differences. Some of the group 
means (Kautokeino, Karasjok and Kistrand) give low sex ratios, due, presumably, 
to the fact that the male means are too high to be characteristic of pure Lapps. 

The last column of Table IIa shows sexual comparisons of the zygomatic angle.
In all series except the Negro the female mean is distinctly smaller than the male, 
although the difference is insignificant in the case of the Norwegian series. 

Sexual comparisons of the indices (Table IIb) are of greater interest. In
accordance with the results of Woo & Morant, the frontal index of flatness (SFi) shows
no significant differences, although it appears to have a slight tendency to be lower
in the female sex. This is also true for the maxillary index of flatness (SMi), whereas
the zygotemporal subtense index (SZi) shows a slight tendency to be greater in
the female sex (cf. the relations of Z∠). The means of the SS index indicate, in
accordance with those of the SM index, that the maxillary region tends to be
somewhat flatter in females than in males. The values of the different breadth
indices show very small sex differences, but nevertheless they show fairly
consistent relationships. In all groups the biorbital breadth is a little larger relative
to the bimaxillary breadth in females than in males. The three breadth
measurements of the middle part of the facial skeleton (GB, ZB and FB) show a slight
tendency to be relatively larger anteriorly than posteriorly in female compared
with male skulls (cf. Z∠). In my material this relation applies particularly to the
Eskimos. It may be noted that for their larger series Woo & Morant found a
distinctly lower mean than I have for the bimaxillary breadth (GB) in female Eskimo
skulls. The sex ratio of this measurement in their material is only 92.7, as against
95.2 for the biorbital breadth (IOW).
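
The two quantities tabulated for each character, the sex ratio and the difference between the male and female means, can be sketched as follows. The function names and the numerical values are hypothetical, and the standard error of the difference is computed by the usual formula for two independent samples, which is an assumption here since the text does not state how its standard errors were obtained.

```python
import math

def sex_ratio(female_mean, male_mean):
    """Female mean expressed as a percentage of the male mean."""
    return 100.0 * female_mean / male_mean

def mean_difference(male_mean, male_sd, n_male, female_mean, female_sd, n_female):
    """Difference of the means (male minus female) and its standard error,
    assuming two independent samples (an assumption, not the author's
    stated procedure)."""
    diff = male_mean - female_mean
    se = math.sqrt(male_sd**2 / n_male + female_sd**2 / n_female)
    return diff, se

# Hypothetical means and standard deviations in mm for series of 100 skulls.
ratio = sex_ratio(92.0, 96.0)
diff, se = mean_difference(96.0, 5.0, 100, 92.0, 4.0, 100)
```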
4. Racial comparisons

Of the breadth measurements, the internal biorbital (IOW) shows the smallest
differences between the series and the bizygotemporal (ZB) the largest. If the
short Negro and Australian series are disregarded, the smallest means of all the 
breadth measurements are found to be the Norwegian. Woo & Morant give a 
slightly lower mean for the biorbital breadth, and a considerably lower mean 
(3.4 mm. less) for the bimaxillary breadth, for their nineteen Norwegian skulls
than my values for the medieval Oslo series. The values found by these authors 
for male Anglo-Saxon and medieval English skulls are slightly higher for the 
biorbital breadth, and slightly lower for the bimaxillary breadth, than my 
Norwegian values. With regard to the two breadth measurements in question, 
there is little difference between the Oslo and Lapp series, particularly in the 
case of the male skulls. It may be noted that for the 140 male Lapp skulls 
K. E. Schreiner has given a mean bimaxillary breadth of 95.5 mm., which is
0.8 mm. less than my value for male Lapp skulls and practically the same as my
value for male Norwegian skulls. This author has also calculated for 121 female
skulls a mean (91.7) which is slightly lower than mine. The difference between
our values may be due to some extent to a difference in locating the zygomaxillary
point. However that may be, the differences between my female means for
Norwegian and Lapp skulls cannot be considered statistically significant in the
case of either IOW (0.81 ± 0.51) or GB (1.13 ± 0.56). The bizygotemporal breadth
(ZB), however, is significantly greater in the Lapp than in the Oslo skulls. The 
means of all the breadth measurements of the middle facial skeleton are clearly 
greater for Eskimos than for Lapps, and the biorbital breadth shows differences 
of the same sign, though they are much smaller. It is worthy of note that this 
last breadth is greater, or at all events not smaller, in the remarkably narrow- 
headed Eskimos than in the broad-headed Lapps. The bimaxillary breadth is 
probably greater in Eskimos than in any other human race, and among the known 
peoples of the earth only the American Indians appear to approach them in this 
respect. 

Just as the breadth measurements increase from Norwegian to Lapp and
from Lapp to Eskimo type, so, to a still greater degree, do the subtenses to IOW
and GB decrease. Consequently the indices derived from these measurements
show marked differences. My means of Sub. IOW and of the frontal index of
flatness for male Oslo skulls are higher than those found by Woo & Morant for
their Norwegian series, and they accord better with the values given by these
authors for Swedes. On the other hand, their means of the two measurements
for Eskimos are greater than mine.

With their exceedingly low means for the index of maxillary flatness (SMi), 
the Eskimo differs greatly from the other groups examined, including the Indian 
which otherwise bear some resemblance to the Eskimo. In fact, I know of no 
characters more capable of indicating the peculiarity of the Eskimo skull than
this index taken together with the chief cranial indices. Nevertheless, the 
Eskimos are extreme in nearly all characters, and the question remains of the 
extent to which the index can be considered of value for racial classification in 
general. I can only contribute a little towards the solution of this problem by 
comparing my Norwegian and Lapp skulls. The means of the index of maxillary 
flatness reveal the following differences: ♂ 2.56, ♀ 3.43, with standard errors of
0.40 and 0.38, respectively. The differences are of marked significance.
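
A rough check of this statement is to express each difference as a multiple of its standard error. The sketch below uses the two differences and standard errors quoted above; the function name is hypothetical, and no particular criterion of significance is assumed, since the text does not state the one it uses.

```python
def ratio_to_standard_error(difference, standard_error):
    """Express a difference between means as a multiple of its standard error."""
    return difference / standard_error

# Differences in the index of maxillary flatness (Norwegian and Lapp series)
# with their standard errors, as quoted in the text.
differences = {"male": (2.56, 0.40), "female": (3.43, 0.38)}
ratios = {sex: ratio_to_standard_error(d, se)
          for sex, (d, se) in differences.items()}
```

Both ratios exceed six, which is in keeping with the description of the differences as markedly significant.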

The zygotemporal subtense (Sub. ZB) and its index (SZi) show entirely
different relations. They are associated less with the “flatness” of the facial
skeleton than with the antero-posterior lengths of the calvaria and facial skeleton.
Among the groups which I have examined the Lapp skulls of both sexes have the 
smallest means for the subtense. The Eskimo values are distinctly higher for the 
subtense, but owing to its considerable zygotemporal breadth the type has the 
lower index. The Norwegian skulls have distinctly higher means, both for the 
absolute measurement and for the index, than Lapps and Eskimos, but their 
means are exceeded by those for the prognathous Negro and Australian skulls. 

The index SSi, which gives Sub. GB as a percentage of Sub. ZB, shows the
highest means for the Oslo skulls and, as a matter of course, very low means 
for the Eskimo skulls. 

Among the indices which relate the different breadth measurements to one 
another, the maxillo-orbital breadth index (GOi) would appear to be of value for
racial classification, as it appears to differentiate families of races. Judging 
from the means given by Woo & Morant for the biorbital and bimaxillary breadths 
the index lies below 100 (that is to say, the former is greater than the latter
breadth) in all European, southern Asiatic, the Australian and the majority of
African populations, while it exceeds 100 in the case of eastern Asiatic, the majority
of American, and a few African and Oceanic populations. As regards my material, 
the Lapps appear to deny their presumed Mongolian origin, since their values for 
this index do not exceed the Norwegian to a significant extent, while the Eskimos 
and Indians have means distinctly above 100. 

The ZO index rises considerably on passing from the Norwegian to the Lapp 
and then to the Eskimo series. As regards the other indices of breadth, I will 
merely refer to the low values for the ratio of the bimaxillary to the facial breadth 
(GFi) of Lapps, who are not only distinguished by their weak mandibles, but also
by their weak maxillary bones in contrast to those of Eskimos. 

Finally, something must be said of the zygomatic angle. As the calculation of
this is based on four different measurements, its value may be influenced by as 
many errors. Moreover, the two subtenses which are used do not lie in exactly 
the same plane. Even granting these defects, the angle, nevertheless, is fairly 
expressive as illustrating a feature which is not measured by the indices. I have 
previously dealt with the sexual differences, and will here confine myself to calling 
attention to the difference between the means for Lapps and Eskimos. Both
these types are characterized by broad and flat faces with what are called “high
cheek bones”. The low values for Eskimos may appear to be unreliable, but doubts
as to this will disappear on inspection of the skull of an Eskimo from the base. 
The unusual breadth of the Eskimo face is due partly to the great bimaxillary 
breadth, and partly to the considerable curvature of the malar bones, but the 
posterior parts of these bones are long and the zygomatic arches only protrude 
slightly in a lateral direction. The bimaxillary breadth of the Lapps, on the other 
hand, is distinctly smaller, the malar bones are rather short in a transverse 
direction and the arches protrude much more laterally. 


5. Variabilities

In Tables IIIa and b will be found the standard deviations, while Tables 
IVa and b give the coefficients of variation, for the Norwegian, Lapp and Eskimo 
series. The bimaxillary (GB) is the most variable breadth and the bizygotemporal
(ZB) is the least variable, while the bimaxillary is the most variable subtense,
judging by coefficients of variation, and the bizygotemporal is the least variable.
In the case of the subtense indices, also, the index of maxillary flatness (SMi)
varies most, while the zygotemporal index varies least. These relations are 
undoubtedly due to the great variation in the form of the facial skeleton in the 
region of the zygomaxillary suture, which affects all the indices involving the 
bimaxillary breadth. As a test of sexual and racial differences in variability I am 
restricted to my Oslo and Lapp material. With regard to this question, I am 
bound to admit that, as long as we cannot count upon absolute accuracy in sexing 
skulls, very little emphasis can be laid upon any difference found. In my material 
male variability tends on the whole to be slightly greater than female, both as
regards absolute measurements and indices, except in the case of Sub. IOW.

Table V gives the average coefficients of variation in the two male and two 
female series for three breadth measurements (IOW, GB and ZB), for the three
corresponding subtenses, for the three corresponding subtense indices (Nos.
viii-x), and for the three corresponding breadth indices (Nos. xii-xiv). Averages
are also given for all six absolute measurements, for all six indices and, finally, 
for all twelve characters. At the bottom of the table the corresponding averages 
are recorded for the male plus the female constants. It will be observed that most 
of the male averages slightly exceed the female. As regards racial differences, the 
average for all six absolute measurements is greater for Lapps, but the average 
for all six indices is greater for Norwegians. All the differences between the 
averages for the two series are, however, very small. 
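
The averages in question are plain arithmetic means of coefficients of variation. As a sketch (the function names are hypothetical), the three coefficients tabulated for the subtense indices SFi, SMi and SZi in the female Norwegian and female Lapp series can be averaged as follows.

```python
from statistics import mean

def coefficient_of_variation(standard_deviation, mean_value):
    """Coefficient of variation: the standard deviation as a percentage of the mean."""
    return 100.0 * standard_deviation / mean_value

def average_cv(coefficients):
    """Arithmetic mean of a set of coefficients of variation, to two decimals."""
    return round(mean(coefficients), 2)

# Coefficients of variation for SFi, SMi and SZi, female series.
norwegian_female = [11.97, 13.63, 7.30]
lapp_female = [10.69, 14.63, 6.96]

norwegian_average = average_cv(norwegian_female)
lapp_average = average_cv(lapp_female)
```

The two results, 10.97 and 10.76, agree with the entries for the three subtense indices in the female rows of Table V.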
TABLE IVa

Coefficients of variation for the absolute measurements 
TABLE IVb

Coefficients of variation for the indices (legible entries only)

Sex   Group        SFi             SMi             SZi
♂     Norwegian    ..              13.84 ± 0.98    ..
♂     Lapp         ..              14.93 ± 1.06    ..
♂     Eskimo       ..              24.48 ± 3.46    ..
♀     Norwegian    11.97 ± 0.85    13.63 ± 0.96    7.30 ± 0.52
♀     Lapp         10.69 ± 0.76    14.63 ± 1.03    6.96 ± 0.49
♀     Eskimo       12.24 ± 1.73    16.18 ± 2.28    5.66 ± 0.80

The columns for SSi, GOi, ZOi and the remaining breadth indices are illegible,
apart from three female entries for one breadth index (Norwegian 4.43 ± 0.31,
Lapp 4.06 ± 0.29, Eskimo 4.02 ± 0.57).
TABLE V

Average coefficients of variation for different sets of characters

Series                3 Breadths   3 Subtenses   3 Subt. indices   3 Br. indices   6 Absolute measurements   6 Indices   All 12 characters
Norwegian ♂           4.05         ..            11.02             4.68            7.69                      7.88        7.77
Norwegian ♀           3.89         ..            10.97             4.11            7.62                      7.64        ..
Lapp ♂                4.92         11.42         11.27             4.20            8.17                      7.74        ..
Lapp ♀                3.87         11.68         10.76             3.98            7.68                      7.37        7.52
Norwegian + Lapp ♂    4.49         11.38         11.15             4.44            7.93                      7.79        7.86
Norwegian + Lapp ♀    3.78         11.41         10.87             4.05            7.60                      7.46        7.53
Norwegian ♂ + ♀       3.97         11.24         11.00             4.40            7.65                      ..          7.65
Lapp ♂ + ♀            4.30         11.66         ..                4.09            7.93                      7.66        7.74


6. Conclusions

The chief result yielded by my study is that measurements of the subtenses 
from the most posterior points on the margins of the pyriform aperture to certain 
other bilateral points would be a valuable addition to the routine technique 
followed in describing series of skulls. Of these measurements the maxillary 
index of facial flatness appears to be the most useful, but the maxillo-orbital 
breadth index also gives comparisons of considerable interest. The adoption of
these characters in craniological research will necessitate more precise definitions
of the zygomaxillary and zygotemporal points than those used hitherto. Furthermore,
a new instrument should be designed for measuring projections from the
nasolateral margins.
TRANSPOSITION OF THE VISCERA AND OTHER
REVERSALS OF SYMMETRY IN MONOZYGOTIC
TWINS

By E. A. COCKAYNE, D.M., F.R.C.P.

The twins, Eileen and Joan C., were first brought to my notice by Dr Reginald
Lightwood, who saw them when they were six years old at the Hospital for
Sick Children, Great Ormond Street, and found that one had dextrocardia and
the other was normal. He remembered that they were very much alike and
thought they were monozygotic. With some difficulty I got into touch with
them, and Dr James Graham, Assistant County Medical Officer, Essex C.C.,
kindly examined the parents and the surviving brothers and sisters and found
that in all of them the heart was in the normal position and no cardiac abnormality
was present. The other children are: Eric, aged 21; Albert, aged 18;
Doris, aged 16; Gladys, aged 12; Betty, aged 11; John, aged 9; and Iris, aged
6 years. One boy died in 1919, aged 3|-, of lymphatic leukaemia. The twins, 
born in August 1924, were 13 when examined. The parents are English and are 
not blood relations. No other case of transposition of the viscera is known to
have occurred in the family. 

The mother says the twins weighed 3 lb. at birth and that there were two
afterbirths, i.e. each had a separate placenta. Clinically the heart is on the right
in Eileen and on the left in Joan, and the size and sounds are normal in both.
An electrocardiogram of Eileen taken by Dr J. L. Lovibond at the Middlesex
Hospital shows inversion of all waves in lead 1, but that of Joan is normal (see
pp. 290, 291). An X-ray of the chest and abdomen and a barium swallow showed
dextrocardia with the stomach on the right and the liver on the left side in Eileen, 
and the normal position of the viscera in Joan. 

The twins are very much alike in appearance and, though it is possible to 
distinguish one from the other when together, there is very little doubt that they 
are monozygotic. In view of the rarity of such a mirror image condition a 
number of confirmatory observations were carried out. Eileen is right-handed, 
while Joan is left-handed. Their handwriting is very much alike, but is unformed, 
and it is difficult to say how far the resemblance is due to teaching. B, R, and r 
are formed in exactly the same way in both. 

Miss Ida Mann examined them and made the following report : 

Eileen. Visual acuity 6/6 right and left. Hypermetropia in the right eye and 
hypermetropic astigmatism in the left. In both eyes the error is slightly higher 
than in her twin’s. Retinal arterial pattern dissimilar in right and left eyes and 
does not resemble her twin’s ; practically no mesodermal pigment, well marked 
lesser circle and no remains of pupillary membrane. The right eye is the master eye. 
Joan. Visual acuity 6/6 right and left. Low-grade hypermetropic astigmatism
in both eyes. Axes oblique. Retinal arterial pattern dissimilar in right
and left eyes and does not resemble her twin’s. Iris pattern the same in both eyes 
and exactly similar to her twin’s, practically no mesodermal pigment, well 
marked lesser circle and no remains of pupillary membrane. The left eye is the 
master eye. 

Miss Mann says that it is usual to find the retinal arterial pattern different 
in the two eyes of the same person and in corresponding or opposite eyes of mono- 
zygotic twins. According to Viggo Eskelund no two persons show an identical iris 
pattern, but a broad classification into types is possible and these types are 
determined genetically. Unfortunately the iris pattern in the twins is a very 
common one. 

Dr Phyllis Kerridge tested their hearing in a sound-proof room and both 
gave very similar graphs, bone conduction being normal and air conduction low 
normal. 

Dr G. M. Morant took the following measurements in millimetres:


                                           Eileen   Joan
Stature                                    1429     1437
Left cubit                                 396      396
Right cubit                                397      396
Maximum head length                        180      184
Maximum head breadth                       136      136
Head height (an uncertain measurement)     128      126
Cephalic index                             75.6     73.9
Bizygomatic breadth                        121      121

                                           Eileen        Joan
                                           R.     L.     R.     L.
Maximum length of ear                      56     58     55     57

Length of phalanges with fingers strongly flexed at metacarpophalangeal joints:

2nd digit                                  92     92     94     93
3rd digit                                  99     101    103    102
4th digit                                  93     95     98     98
5th digit                                  76     76     76     78


The girls are about 2^ in. below the English average stature for their age and 
social class. The small differences between measurements suggest that they are
monozygotic. The small difference between head lengths (leading to one between
the cephalic indices) is probably the most significant. Joan has slightly longer
digits than Eileen. As all the measurements are subject to errors of at least
1 mm. it is safest to conclude that they do not provide any evidence of asymmetry.
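
The cephalic index quoted among the measurements can be checked from the head length and breadth. The sketch below assumes the usual definition, 100 times the maximum head breadth divided by the maximum head length; the function name and the use of Python are illustrative choices, not part of the original report. With the recorded values of 136 mm breadth and 180 mm and 184 mm length, it gives about 75.6 for Eileen and 73.9 for Joan.

```python
def cephalic_index(head_breadth_mm: float, head_length_mm: float) -> float:
    """Return 100 * maximum head breadth / maximum head length (assumed definition)."""
    if head_length_mm <= 0:
        raise ValueError("head length must be positive")
    return 100.0 * head_breadth_mm / head_length_mm


# (maximum head breadth, maximum head length) in millimetres, as recorded for the twins.
twins = {"Eileen": (136, 180), "Joan": (136, 184)}

indices = {
    name: round(cephalic_index(breadth, length), 1)
    for name, (breadth, length) in twins.items()
}

for name, value in indices.items():
    print(f"{name}: {value}")
```

The difference between the two indices comes entirely from the 4 mm difference in head length, since the breadths are equal; a reading error of 1 mm in length shifts the index by roughly 0.4.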

He also took photographs and reported as follows : The photographs are of 
three kinds. 

(I) Profiles. The outlines of the faces of the two girls are remarkably similar, 
their upper lips being unusually short. Eileen’s chin is rather longer and 
straighter than Joan’s. No difference between the ears (not all visible) was noted.
Both have free lobes. They provide no evidence of asymmetry.

(II) Full-face. The photographs are not truly full-face, rather more of the
right than of the left being shown in each case. The outlines are very similar,
Joan having a rather shorter and more pointed chin than Eileen. The teeth 
show a slight anomaly, which is the same in both girls. The central upper incisors 
are crossed, the left overlapping the right. 

A palmaris longus muscle is present in both twins on both sides. Hair colour 
in both twins matches scale 7. The hair whorl was clockwise in both twins. 

Miss David took fingerprints and made the following report: 

A glance at Waite’s tables shows that radial loops are nearly as common as 
ulnar loops on the forefinger. The remaining fingers are all ulnar loops, which are 
the commonest of all patterns. There is a distinct similarity between the counts 
of these ulnar loops and in the actual pattern. In both twins the actual counts 
approximate to the median value of the distance of ridges in loops. The thumbs 
are interesting and at first sight appear very similar, but there is a difference 
as shown in the counts, and the right thumb of Eileen is like the left thumb of 
Joan, while the left thumb of Eileen is like the right thumb of Joan. The most 
striking thing is that Eileen has a radial loop on the forefinger of the right hand, 
and Joan has a radial loop on that of her left hand, and Eileen has an ulnar loop 
on the forefinger of the left hand and Joan has an ulnar loop on that of the 
right hand. 



          Right hand            Left hand
          Joan       Eileen     Joan       Eileen
T         23 C 9     21 C 6     6 C 20     11 C 24
1         UL 13      RL 5       RL 5       UL 13
2         UL 13      UL 14      UL 12      UL 13
3         UL 12      UL 12      UL 12      UL 13
4         UL 13      UL 12      UL 13      UL 10

UL = ulnar loop. RL = radial loop. C = composite.
The numbers are the number of ridges.


The palmar and plantar patterns are similar, but those of Eileen’s left side are 
more like those of Joan’s right side than those of her own right side and vice versa. 

Dr G. L. Taylor finds that the blood groups in both twins are O MN. Both
are able to taste phenyl-thio-carbamide. The other taste tests were unsatisfactory 
because they were unable to discriminate clearly between sour and bitter. 

At school the twins were in the same class and much alike in mentality.
Questioned in different rooms with no opportunity of overhearing one another or 
comparing notes each said that her favourite colour was blue, one immediately 
and the other after some hesitation. Neither knew her sister’s favourite colour. 

[Figure: Eileen. Electrocardiograms.]

I think the resemblance in general appearance and the agreement in so many
independent characters, a number of which are known to be determined genetically,
prove that Joan and Eileen are monozygotic. They differ from the majority
of such twins in the extent to which they are mirror images of one another:

Eileen with complete transposition of the viscera. 

Joan with normal viscera. 

Eileen right-handed with the right eye the master eye. 

Joan left-handed with the left eye the master eye. 

Eileen with a radial loop with five ridges on the right forefinger. 

Joan with a radial loop with five ridges on the left forefinger.

Eileen with an ulnar loop with thirteen ridges on the left forefinger. 

Joan with an ulnar loop with thirteen ridges on the right forefinger. 

Eileen with thumb ridges of the right hand closely resembling those of Joan’s 
left hand, and with thumb ridges of the left hand closely resembling those of 
Joan’s right hand. 

I can find records of only four similar cases in the literature. Baron (1826) 
showed one before the Section of Medicine of the Academy of Medicine, Paris, 
in December 1825. A short report says that the twin with transposed viscera 
was a girl, who died at the age of 8 days. Kuchenmeister states that the twins 
were monozygotic and that one had normal viscera and the other complete 
transposition. Miller (1893) records a case of monozygotic twins, one with situs 
inversus viscerum and the other with situs solitus, but gives neither their age
nor sex. Dubreuil-Chambardel (1927) records male twins, aged 26, identical in 
general appearance, in weight, height, and in physical characters. Each had 
varices on the lower limbs of the same type and developing in the same way. 
One had a harelip on the left, the other on the right, and one had complete 
transposition of the viscera, while the other was normal in this respect. The 
visceral condition was confirmed by radiological examination. I have been unable 
to refer to the original account of the twins recorded by Araki (1934), but Taku 
Komai, though he gives neither their age nor sex, says they were monozygotic 
and that one had normal viscera and the other complete transposition. Taku 
Komai says that this is the fifth case recorded. Possibly he includes that recorded
by Tamm and later by Betschler. This refers to female twins, stillborn at
the seventh month, one normal, the other oedematous and hydrocephalic, with
the heart and aorta on the right side, no lung on the right and a small one on the
left. The other viscera were correctly placed, but the liver was small and the 
kidneys large. There was one placenta with two cords implanted centrally with 
the umbilical veins anastomosing. 

Newman in his Biology of Twins shows that the armadillo, Dasypus novemcinctus,
normally produces monozygotic quadruplets. He makes the following
observations. At first there are two embryos and later a secondary embryo is 
formed to the left of each primary embryo. Asymmetry of the scutes is not 
uncommon and in the quadruplets all grades of mirror image formation are 
found. There may be mirror-imaging between individuals of opposite pairs 
(primary embryos), but this is much less common than imaging between twin 
partners derived from one half of the egg (a primary embryo and the secondary
embryo to its left). In general mirror-imaging between “opposites” is evidence 
of a residuum of a primary bilateral symmetry that held sway in the blastocyst 
before polyembryonic fission began. When the primary outgrowths are formed, 
they are the product of the antimeric halves of the first embryo and should 
therefore show mirror image relations. But a partial physiological isolation of 
the two halves permits a certain reorganization or regulation of new symmetry 
relations, which tends more or less completely to destroy the original symmetry, 
yet often leaving a trace of the latter. Finally when each secondary outgrowth 
organizes its own bilateral symmetry, it tends to lose, partially at least, the 
earlier symmetry relations, and to establish its own mirror-imaging of right and 
left halves. In some cases traces of all three symmetry systems appear in a single
set of foetuses, but it is common to find only two systems interacting. Newman
says that no case of visceral reversal was found in spite of a careful search 
through a large number of foetuses. This, he thinks, is due to the fact that 
twinning is initiated and carried out in the ectoderm, and that the endoderm 
becomes involved only passively and considerably later. 

In the case of human monozygotic twins there may be almost complete 
mirror image arrangement of palmar patterns so that the left hand of x corre- 
sponds with the right hand of y and vice versa. This corresponds with the rather 
rare mirror-imaging of “opposites” seen in armadillo quadruplets and in 
Newman’s opinion goes far to prove that polyembryony actually occurs in man. 
A commoner manifestation in man is reversal of symmetry in the pattern of an
index finger of one twin so that it mirrors the condition in the corresponding 
index finger of the other twin. This reversal of pattern occurs in both hands of 
Eileen and Joan. 

In addition to the ridges on the skin other epidermal structures may show a 
mirror image arrangement, such as naevi and angiomata, and Siemens (1924) 
records an accessory nipple below the normal one, which was on the right side 
in one and on the left side in the other twin. 

Newman believes that monozygotic human twins become physiologically 
isolated at a considerably earlier period than do armadillo quadruplets and 
founds his belief on the fact that there is so little mirror-imaging in the former 
and so much in the latter. He says that as a general rule the earlier the separa- 
tion the more complete is the reorganization of the symmetry relations in the 
separate individuals and the less residuum of the original common symmetry. 

In double monsters of monozygotic origin there is, according to Dubreuil-
Chambardel (1927), frequently a hare-lip, which is axial in both or lateral in 
both, and as a rule it is of the same degree, and the hare-lip of one component is 
always a mirror image of the hare-lip of the other. He points out as many others 
have done how often one component of a double monster has complete or partial 
transposition of the viscera, the arrangement being that the apices of the two 
hearts point away from the place where the twins are attached to one another.
There can be no doubt that neither the hare-lip nor the transposition of viscera
in double monsters is of genetic origin. Hare-lip in man is sometimes determined 
genetically, but the incidence in double monsters is far too high for this to be 
true in their case. Newman says that in the case of double monsters fission must 
begin much later than usual and is never completed. While in the armadillo 
polyembryony is the rule and fission is standardized, in man it is the exception 
and there is considerable variability in the period at which fission takes place. 
Transposition of the viscera in man is recessive to the normal arrangement, and 
elsewhere I have suggested (1938) two ways in which it could affect only one
member of a pair of monozygotic twins. Apart from the improbability of so unusual
an occurrence happening in at least five recorded cases, neither explanation
accounts for the occurrence of other mirror image arrangements, and any genetic 
explanation must be abandoned. If Newman’s views about mirror image forma- 
tion are accepted, the very rare cases of human monozygotic twins with a 
mirror image arrangement of the viscera in addition to that of ectodermal 
structures are due to an unusually late fission. The fact that they are so rare 
seems to show that unless fission is very late each twin reorganizes its own 
symmetry and, if genetically normal, the viscera are normal in both, but if the 
zygote is a homozygous recessive for transposition of the viscera, the viscera are 
transposed in both. Apparently few twins in which late fission occurs succeed in
becoming completely separate, for double monsters with their own mirror image
symmetry, i.e. with the viscera transposed in one and normal in the other, are
far commoner than separate twins with this arrangement. The separate twins
which do show this arrangement have in fact just escaped being double monsters.

My thanks are due to Prof. R. A. Fisher for kindly putting the resources of the 
Galton Laboratory at my service and to Dr Julia Bell for her help and advice, 
and to Miss David, Dr Taylor and Dr Morant. I wish also to express my gratitude 
to Miss Ida Mann and to Dr Lovibond. Dr Graham had intended to write a 
report on the twins, and I should like to thank him for allowing me to do so 
instead and for examining the other members of the family. 
THE HUMAN REMAINS OF THE IRON AGE AND OTHER
PERIODS FROM MAIDEN CASTLE, DORSET 

By C. N. GOODMAN and G. M. MORANT 


1. Introduction 

Excavations at Maiden Castle, the largest fortified site of prehistoric date in 
England, were conducted under the direction of Dr R. E. M. Wheeler during 
four consecutive seasons from 1934 to 1937. The discoveries made were of unusual 
archaeological interest, and they include a number of skeletal remains which 
were preserved with unusual care. The bones available represent 104 individuals 
ranging in age from a foetal to a senile stage of development. This paper is a 
report on the eighty-three skeletons which are sufficiently well preserved to be 
of some anthropological interest, and the periods, sexes and age groups to which 
they are assigned are given in Table I. An unpublished report on the excavations 
by Dr Wheeler contains particulars of each of the 104 individuals, and a section 
by the present writers in the same work gives a summary of the general con- 
clusions reached below and fuller descriptions of the mutilated specimens. 


TABLE I 

The periods, sexes and ages of individuals from Maiden Castle whose 
skeletons were measured 



* Including one individual (T 28) classed as Iron Age C or Romano-British.


The metrical study of the skeletons made hitherto has been confined to the 
crania and mandibles, on which all the customary measurements have been 
taken, and to the clavicles and long bones of the arms and legs, described by
lengths only. Individual measurements and remarks for the crania and long
bones are given in the appended Tables VII and VIII; remarks on the mandibles
are given with those on the crania, but the individual measurements of these
bones are not provided. Contours of the crania have been drawn, but they are
not used here, as additional English material of the same periods would be 
needed to furnish sufficiently reliable type figures. The writers hope to undertake 
a more detailed examination of the long bones and an examination of the other 
bones of the skeleton. 

TABLE II 


Maximum lengths of the shafts of long bones of infants of Iron Age 
date and of modern foetuses 


No. 

Year excavated ... 
Period* 

CiMl 

19.S6 

A 

Pill 

1937 

A 

QII 

1937 

A 


PI 

1937 

B 

RII 

1937 

B 

T19 

1937 

B 

0 

1936 

Of 

P 3.6 
1937 
Cf 

Modern§ 

Modern|| 

Femora 

11. 


75-1 


78-9 






664 

68'0 


L, 

— 

75-1 

— 

— 

85-2 

76-9 

724 

81-5 

— 


68'0 

Tibiae 

R. 



64-() 

— 

68-2 



67-0 

68-2 



664 

5b0 


L, 

— 

C4'2 

— 

68-8 

— 

674 

68-8 

_ 

724 

56'6 

614 

Fibulae 

11. 

— 

61d) 








68'0 


... 

48'2 


L. 

— 

61-2 

— 

— 

— 

__ 


— 

— 


48'2 

Humeri 

R. 

7l)-C 

64-1 

— 

60'0 

74-0 

67'0 


714 


.684 

634 


L, 

70-7 

64-3 

75'5 


— 

— 

— 

714 

73'0 

68-2 

53'0 

Radii 

R. 


49'0 




.68-2 

524 

51'8 

i)5’6 




44'0 


L. 

— 

494 

— 


.68-() 

— 

51'2 

56'7 


45'0 

44'0 

Ulnae 

R. 

644 

.78'2 

— 


66-2 

614 

— 

- 



49’0 


L. 

— 

M'O 

— 

614 

— 

— 

68'8 

64'5 

— 

53-0 

49'0 

Clavieleis 

R, 



46-() 



B 

48'3 

— 

424 

494 

464 


34'6 


L. 

47’0 

46-0 

— 

H 

484 

— 

42-6 

— 

— 

38-8 

34'0 


* A, B and C refer to the chronological divisions of the Iron Age. † Belgic.
‡ Belgic War Cemetery. § Full term. || 7 months foetus.


Nearly all the skeletons are incomplete, but in the majority of cases the sexes 
of the adult and adolescent individuals can be judged, principally from the 
pelvis, with a reasonable assurance of accuracy. It was suspected at first that 
nine skeletons, which are remarkably well preserved in view of their tender age, 
were foetal. The maximum lengths of the shafts of the long bones for these (the
measurements having been taken in any direction by using the flat arms of a
pair of small calipers) are given in Table II. The last two columns on the right
of the table give the same lengths for a child at birth and a seven months foetus.
These two specimens are preserved in the museum of the Department of
Anatomy, University College, London, and we are indebted to Dr Matthew
Young for giving us access to them. All nine of the skeletons from Maiden Castle
are appreciably larger than the two of modern date, and hence it must be
supposed that the former are of young infants, probably all from three to six
months old. No other measurements of them were taken. 

The remaining seventy-four skeletons are available for racial comparisons,
and the majority of these are of Iron Age date. Thirty represent the population 
of the site at the very end of the period, as they came from a cemetery used for 
the Belgic defenders of the Castle who were slain by Roman invaders in the year 
A.D. 43. Nearly half of the skulls in this series bear witness to the event in the 
form of sword-cuts and other mutilations. The Belgic War Cemetery series is 
just long enough to make statistical comparisons between it and other groups 
worth while, but all the other groups distinguished in Table I are clearly too 
small to stand alone. In some comparisons below a composite series made up by 
pooling all the other individuals of Iron Age or Romano-British date is con- 
sidered, either apart from the Belgic War Cemetery series or combined with it. 
This procedure is of provisional value only, of course. There may have been 
changes in the racial composition of the population of Maiden Castle during the 
Iron Age and again in Roman times, and the evidence available is quite in- 
sufficient to show whether this was so or not. 

It should be appreciated, too, that it is not unlikely that a small community, 
which may have been peculiar owing to relative inbreeding, may have persisted 
there for several generations, and the characteristics of such a local group may 
mislead if it is taken to typify the large racial population of which it formed a 
special part. The new material from Maiden Castle provides a very welcome 
addition to our meagre knowledge of the physical characters of the inhabitants 
of England in Iron Age times. It may be hoped that it will form a nucleus to 
which other specimens of the period — both those already housed in museums 
and others as yet undiscovered — may be added, until the evidence is abundant 
enough to satisfy the most exacting anthropologist. The object of this paper is to 
provide a descriptive record of the skeletons available with such a hope in view, 
and any results regarding racial relationships made in it are intended to be of 
a tentative nature only. 


2. Non-metrical features of the Maiden Castle series

The remarks on the Iron Age and Romano-British series given in this section 
relate almost entirely to the skulls. Comments on a few of the long bones which 
exhibit gross abnormalities are given in Table VIII, and the remaining parts of 
the skeletons have not been examined. The Neolithic skeleton of a young man
(Q 1) was extensively mutilated, and otherwise it is not remarkable. The skull-cap
of the same period is also that of a young man. The Saxon skeleton (Q) is
male but the age at death cannot be estimated as the skull is missing. The
following remarks refer to the remaining specimens, which are all of Iron Age
or Romano-British date. 
Estimates of the ages at death of the adults were obtained by noting the
conditions of the coronal, sagittal and lambdoid sutures, and the ectocranial
aspects of these give the following frequencies:



                                    All sutures   Sutures beginning to     All sutures   Totals
                                    open          close or partly closed   closed
♂  Belgic War Cemetery series       5 (28%)       12 (67%)                 1 (6%)        18
   Others                           3 (21%)       9 (64%)                  2 (14%)       14
   Total series                     8 (25%)       21 (66%)                 3 (9%)        32
♀  Belgic War Cemetery series       3 (43%)       4 (57%)                  -             7
   Others                           7 (37%)       10 (53%)                 2 (11%)       19
   Total series                     10 (38%)      14 (54%)                 2 (8%)        26



It is probable that all the people buried in the Belgic War Cemetery were 
massacred. The age constitutions of the short adult series are very similar, 
however, to those of the series made up by other people interred at Maiden 
Castle, who probably died from natural causes. The massacre must have been 
carried out without regard to age, though there appears to have been some sex 
discrimination, since twice as many men as women were excavated from the 
cemetery (see Table I). The percentage frequencies for the total series show the 
usual sex differences, due to the fact that the sutures tend to close at an earlier 
age in males than in females. Comparison with values obtained in the same way 
for other series (see Risdon, 1939, p. 107) shows that the Iron Age men died at 
a rather younger age, on the average, than those interred in English cemeteries 
of a later date, but that the women are not distinguished in the same way. In 
considering the frequencies of anomalies, the total series from Maiden Castle is 
referred to below. 
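
The percentage frequencies for the suture conditions can be re-derived from the counts, which is a useful check on a table that has suffered in reproduction. The sketch below is illustrative only. It assumes that the male Belgic War Cemetery count for open sutures is 5 (the value consistent with the stated total of 18 and with 28%), that the empty female cell stands for zero, and that percentages were rounded to the nearest whole number. The names `counts`, `percentages` and `pooled` are hypothetical.

```python
# Counts of skulls by state of the ectocranial sutures, in the order
# (all open, beginning to close or partly closed, all closed).
counts = {
    ("male", "Belgic War Cemetery"): (5, 12, 1),
    ("male", "Others"): (3, 9, 2),
    ("female", "Belgic War Cemetery"): (3, 4, 0),  # empty cell assumed to be 0
    ("female", "Others"): (7, 10, 2),
}


def percentages(row):
    """Express each count as a whole-number percentage of the row total."""
    total = sum(row)
    return tuple(round(100 * n / total) for n in row)


def pooled(sex):
    """Add the Belgic War Cemetery and 'Others' rows for one sex."""
    rows = [row for (s, _series), row in counts.items() if s == sex]
    return tuple(sum(column) for column in zip(*rows))


for sex in ("male", "female"):
    total_row = pooled(sex)
    print(sex, total_row, sum(total_row), percentages(total_row))
```

Run as written, the pooled rows come to 8, 21 and 3 of 32 males and 10, 14 and 2 of 26 females, so that about a quarter of the men and rather more than a third of the women have all sutures open, which is the sex difference remarked on in the text.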

The sagittal suture was normally the first to close, followed by the coronal 
and then the lambdoid. Only one specimen (P 23, male) is definitely anomalous 
with regard to the condition of the principal calvarial sutures. Its sagittal 
suture is obliterated, the coronal is beginning to close and the lambdoid is open.
This specimen shows no apparent sign of distortion and its cephalic index (77.4)
is above the average, so it may be assumed that the sagittal suture was not 
obliterated before maturity was reached. Of thirty-three male frontal bones 
only one (P ii, an isolated bone for which no measurements are given) is metopic, 
and there is only one female metopic specimen out of a total of twenty-seven 
frontal bones. Among Western European crania metopic specimens are usually 
found with a frequency of about 10 %, but the Iron Age series are so short that 
no significance can be attached to the fact that the frequencies for them are 
exceptionally low. There are no examples of interparietal bones, or of an os
epactal, among the thirty-two male and twenty-six female occipital bones. A
much rarer anomaly is exhibited by a male cranium (P 7 A). This is a trace of 
a suture (33 mm. long) extending across the posterior part of the right parietal 
bone: its position can be seen from Plate IIB. One male skull (P 37) shows the 
occipito-mastoid suture obliterated prematurely on both sides. There is one case
of fronto-temporal articulation on both sides among the males, and one of the 
pterion in 11 on both sides among the females. One female specimen has both 
malar bones divided by horizontal sutures. A male skull (0 3, Plate IIIB) has 
the basi-occipital broken exposing the extension into it of the sphenoidal air 
sinuses, the condition exhibited by the Swanscombe skull. There is one large 
cavity and the extremity of a small one to the right of it. 

Wounds that had healed completely during life were noted on the cranial 
vaults of three of the male and two of the female skulls. None of these was 
severe and a female specimen (T 22) has a depression on the right temple which 
was more serious than the other injuries. An adolescent female skull (P 20, 
Plate II A) has the right malar bone deformed, probably as the result of a wound. 
By far the most serious injuries of traumatic origin are shown by the long bones 
of one of the males (T 5). His right elbow was shattered, involving complete
separation of the proximal extremity of the ulna, his left radius was fractured 
near the wrist, and his fractured left fibula became fused to the tibia. Healed 
fractures of the long bones were only found in the case of two other skeletons, 
viz. P 23 (male, right ulna) and P 36 (female, left fibula). Pathological conditions
of bones not of traumatic or dental origin were only noted in the case of one male 
skull (T 9) which has a swelling on the right maxilla below the orbit which was 
probably caused by a tumour, of a skeleton of the same sex (P 25) which has 
osseous deposits on the muscular ridges of the left femur and fibula, and of a 
female (T 12) which has the left humerus markedly deformed, probably as the 
result of an inflammatory condition of the surrounding soft tissues. We are 
indebted to Dr A. M. El Batrawi for help in interpreting the conditions of these 
and other specimens in the series. 

The teeth of the individuals interred at Maiden Castle are by no means well 
preserved. The following frequencies are found for the total “Iron Age” series:



                                       Upper jaw               Lower jaw
                                       ♂       ♀               ♂       ♀
No teeth lost before death             12      [illegible]     22      13
One or more teeth lost before death    16      [illegible]     10      12

In the case of excavated series of skulls which are not of recent date, it is 
customary to find for either sex that more than half the specimens had lost no 
teeth from the upper jaw before death. Abscess cavities were found in several
of the jaws. A female specimen (T 21, Plate III A) has two symmetrically disposed
cyst cavities in the buccal surfaces of the maxillae at the roots of the
second premolars. There are examples of crowded front teeth. A male mandible 
(P 23) has a milk molar retained in the position of the left second premolar, 
which had not appeared, and a female bone (P 26) appears to have had only three 
incisors erupted. The upper jaw of a female skull (T 28, Plate III D) shows the
canines missing and the first premolars rotated. A female mandible (P 14, 
Plate IIIB) has the left third molar impacted but almost fully erupted: there are 
also distinct swellings on the inner alveolar margin extending from the first 
premolar to the first molar on the left side, and from the second premolar to the 
first molar on the right. Several of the Sinanthropus mandibles exhibit this
condition (torus mandibularis), and it occurs more frequently, and to a more
marked extent, in some modern races of man (particularly in Eskimos), but is
seldom found in European series. Dental anomalies appear to be exceptionally
frequent in the short series of skulls from Maiden Castle. 


but nearly all might be considered quite unexceptional if found in any British 
collection of later date than the Bronze Age. There are few examples, however, 
of the markedly retreating frontal bones which characterize the seventeenth- 
century London skulls, particularly those from Farringdon Street. The variation 
exhibited appears to be no greater than that expected for a community of inter- 
marrying people, except for the fact that one female skull from the Belgic War 
Cemetery (P 36, Plate II C, D) stands apart from the others on account of its 
aberrant form. The facial skeleton has an unusual premaxillary height, though 
it is not prognathous; the nasal index is high but not extreme, and the nasal 
bridge is broad and depressed. The cephalic index of this specimen (87.0) is 
easily the highest for the series, and the individual was decidedly short 
(1451 mm., a little over 4 ft. 9 in.), though taller than two other women interred in the 
Belgic War Cemetery. It is possible that the skeleton P 36 is that of an alien 
in Western Europe, but it appears to be more probable that its peculiarities are 
of individual rather than racial significance. Its measurements were included in 
computing averages. 

3. Metrical comparisons of the Iron Age and Romano- 
British crania 

Excluding the two Neolithic specimens, there are 30 adult male and 26 adult 
female crania sufficiently complete to give measurements, though most of these 
are defective to some extent. Individual readings are in the appended Table VII. 
In spite of the small numbers, it appeared worth while computing separate means 
of the more important characters for (a) the Belgic War Cemetery series, and 
(b) all the other specimens of Iron Age or Roman date. The latter is a miscel- 
laneous collection made up by individuals belonging to the following groups: 

                Male   Female                                   Male   Female
Iron Age A        1      -        Romano-British                  3      3
   „     B        1      7        Iron Age or Romano-British      2      1
   „     C        6      8


The pooling of this material, and treatment of it as if all the specimens 
represented a single racially homogeneous population, is obviously a pis aller. 
The means found for the composite group (Table III) are actually very close to 
those found for the Belgic War Cemetery series, which is almost certainly made 
up by individuals who belonged to a single community. On the supposition that 
the variabilities of the groups were of the usual order, all the differences between 
the two sets of means are quite insignificant. Both types have surprisingly large 
basio-bregmatic heights, and hence unusual indices (100 H'/L and 100 B/H') 
involving this diameter. Otherwise, they appear to have no features which can 
be considered at all peculiar in English series of post-Bronze Age date. 

This comparison provides some justification for combining the two sub- 
groups, to give a single series which may be supposed to represent the population 
of Maiden Castle from Iron Age A to Roman times, though it is clear that the 
evidence available is quite inadequate to demonstrate that the racial constitution 
of the population remained stable throughout the period. The following distri- 
butions are found for the total sample: 

Cephalic index (central values): 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87. 
Totals: ♂ 23, ♀ 22. 

Height-length index (central values): 68.5, 69.5, 70.5, 71.5, 72.5, 73.5, 74.5, 
75.5, 76.5, 77.5, 78.5, 79.5. Totals: ♂ 21, ♀ 22. 

[Distribution tables: the class frequencies are illegible in the source.] 



The highest cephalic index (87.0) is for a female specimen (P 36), in the 
Belgic War Cemetery series, which is also remarkable on account of the fact that 
its facial skeleton is of an unusual form. The highest height-length index is for a 

Biometrika XXXI 

TABLE III 

Mean measurements of series of Iron Age and Romano-British 
crania from Maiden Castle 

[Means in mm. or index units, with numbers of specimens in brackets; ".." marks a cell that is missing or illegible in the source.] 

                              Male                                          Female
                 Belgic War                                    Belgic War
                 Cemetery       Others         Total           Cemetery       Others         Total
L                187.7 (14)     189.9 (9)      188.6 (23)      179.0 (7)      ..             180.1 (26)
B                141.4 (14)     140.0 (10)     140.8 (24)      135.7 (6)      135.8 (16)     135.8 (22)
H'               137.1 (14)     136.9 (8)      137.1 (22)      134.4 (7)      131.7 (15)     132.6 (22)
B'                97.2 (14)      96.9 (12)      97.1 (26)       92.5 (7)       94.1 (19)      93.7 (26)
S                384.5 (11)     381.9 (7)      383.5 (18)      371.3 (6)      365.6 (10)     367.7 (16)
U                528.6 (12)     ..             529.5 (20)      ..             508.1 (15)     507.0 (21)
fml               37.8 (13)      37.3 (8)       37.6 (21)       35.9 (5)       36.3 (11)      36.6 (16)
fmb               31.6 (12)      31.2 (8)       31.5 (20)       28.2 (6)      ..              28.6 (15)
LB               102.5 (13)     102.2 (6)      102.4 (19)       96.7 (7)       99.9 (12)      98.7 (19)
G'H               73.3 (8)       71.3 (6)       72.4 (14)       68.1 (6)      ..             ..
NH, L             52.4 (9)       51.2 (7)       51.9 (16)       46.7 (6)      ..              48.3 (18)
NB                25.0 (8)       25.1 (8)       25.0 (16)       22.4 (5)       24.3 (8)       23.6 (13)
O1, L             43.8 (9)       41.2 (6)       42.6 (15)       41.3 (5)      ..              41.3 (14)
O2, L             33.2 (9)       32.3 (6)       32.9 (15)       32.5 (5)       31.9 (9)       32.1 (14)
100 B/L           75.4 (14)      73.5 (9)       74.6 (23)       76.6 (6)       76.6 (16)      76.8 (22)
100 H'/L          73.4 (13)      72.4 (8)       73.0 (21)       76.2 (7)       72.9 (16)      73.6 (22)
100 B/H'         103.3 (13)     102.7 (8)      103.1 (21)      101.6 (6)      102.6 (13)     102.2 (19)
100 fmb/fml       83.5 (12)      83.7 (8)       83.6 (20)       79.2 (5)       82.9 (9)       81.6 (14)
100 NB/NH, L      48.6 (7)       50.2 (6)       49.4 (13)       48.1 (6)       60.0 (8)       49.3 (13)
100 O2/O1, L      76.3 (9)       78.5 (6)       77.2 (15)       78.8 (5)       77.2 (9)       77.8 (14)
P∠                85.7° (8)      86.0° (4)      86.6° (12)      89.0° (3)      83.9° (9)      85.2° (12)


Total series 

                        Male            Female
B"                      122.4 (17)      114.7 (16)
Biast. B                114.3 (19)      107.2 (14)
GL                       93.4 (14)       90.3 (11)
GB                       95.2 (18)       89.4 (14)
J                       133.6 (11)      123.2 (11)
G'1                      46.6 (19)       44.0 (12)
G2                       40.7 (16)       39.0 (11)
SS (?)                    4.7 (16)      ..

[label illegible]       316.7 (20)      ..
[label illegible]       114.2 (24)      ..
[label illegible]       116.5 (22)      112.2 (21)
[label illegible]        99.9 (22)       97.0 (17)
S1                      132.0 (22)      126.6 (22)
S2                      ..              125.3 (21)
S3                      122.3 (22)      116.0 (17)
C*                     1514.7 (19)     1380.7 (19)

[label illegible]        10.1 (16)        9.3 (15)
[label illegible]        58.7 (22)       60.6 (17)
[label illegible]        76.3 (13)       73.6 (11)
[label illegible]        88.7 (12)       87.4 (7)
[label illegible]        47.6 (16)       47.6 (14)
[label illegible]        61.0° (14)      63.2° (11)
[label illegible]        76.3° (14)      76.2° (11)
[label illegible]        42.6° (14)      41.6° (11)

* Reconstructed from L, B and H'. 



male cranium (P 7 A) in the same series. In spite of these outlying cases, the 
distributions for the two characters in question, and those for the other cha- 
racters, provide no clear evidence of racial heterogeneity, though it cannot be 
inferred from them that the total population considered was racially homo- 
geneous. 
If the supposition that all the skulls of Iron Age and Romano-British date 
from Maiden Castle represent a single population be accepted, the series available 
is still not long enough to give reliable racial comparisons of a statistical kind. 
A short series is liable to indicate an absence of differentiation from other series 
when one of an adequate length from the same population will provide evidence 
of distinction, and it will also be liable to give a misleading estimate of diver- 
gence from other types when distinction is clearly indicated. Large enough 
samples must be demanded, in particular, when making comparisons between 
closely related populations, such as the group made up by all the prevailing 
populations of England from the end of the Bronze Age to modern times. In 
spite of its restricted size, coefficients of racial likeness were computed for 
male means between the total “Iron Age” series from Maiden Castle and four 
others, viz. : 

(a) The Anglo-Saxon made up principally by skulls preserved in London 
Museums (Morant, 1926). 

(b) The so-called British Iron Age (Morant, 1926) made up by skulls from 
the south of Scotland and various parts of England. Several of the specimens 
are known to be of Romano-Britons, and some of the others are not dated 
satisfactorily, so the series is of little value. 

(c) The seventeenth-century series from a plague in Whitechapel (Macdonell, 
1904), the revised means given by Hooke (1926) being used. 

(d) The seventeenth-century series from the cemetery in Farringdon Street 
(Hooke, 1926). 

The following coefficients of racial likeness are found for the total “Iron 
Age ” series from Maiden Castle and these four, the numbers in brackets following 
the crude values being the numbers of characters on which they are based, and 
the numbers in brackets following the names of the series being the average 
numbers of skulls on which the means used are based (n̄'s)*: 

                                                       Crude C.R.L.           Reduced C.R.L.
Maiden Castle (17.7) and Anglo-Saxon (34.1)            -0.28 ± 0.18 (29)      ..
      „       (18.5) and British Iron Age (56.4)        2.98 ± 0.21 (20)      10.73 ± 0.77
      „       (17.7) and Whitechapel (90.7)             2.79 ± 0.17 (30)       9.43 ± 0.59
      „       (17.8) and Farringdon Street (96.7)       4.29 ± 0.17 (30)      14.27 ± 0.58


As far as can be told from the scanty evidence, the Maiden Castle and Anglo- 
Saxon samples might represent different sections of the same population, while 
the former series is clearly differentiated from the other three. Previous 
comparisons have shown that the British Iron Age and the two seventeenth- 
century London types are very similar, while the Anglo-Saxon stands apart from 
that cluster. These relationships suggested that the Londoners were descended 
primarily from the pre-Saxon rather than from the Anglo-Saxon population, but 
the new evidence does not support this view. 
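
The reduced coefficients quoted above can be checked against the crude ones. The following is a minimal sketch, assuming the customary scaling of a crude coefficient to a standard pair of series of 100 skulls each, i.e. crude × 50 × (n̄1 + n̄2)/(n̄1 × n̄2), and a probable error of 0.6745 × √(2/M) for a crude coefficient based on M characters. Neither formula is stated in the text, and the function names are hypothetical, introduced only for illustration.

```python
import math


def reduced_crl(crude, n1, n2):
    """Scale a crude coefficient to two series of 100 skulls each (assumed definition)."""
    return crude * 50.0 * (n1 + n2) / (n1 * n2)


def probable_error_crude(m):
    """Probable error of a crude coefficient from m characters (assumed: 0.6745 * sqrt(2/m))."""
    return 0.6745 * math.sqrt(2.0 / m)


# (crude C.R.L., number of characters, mean n for Maiden Castle, mean n for the other series)
comparisons = {
    "British Iron Age": (2.98, 20, 18.5, 56.4),
    "Whitechapel": (2.79, 30, 17.7, 90.7),
    "Farringdon Street": (4.29, 30, 17.8, 96.7),
}

for name, (crude, m, n1, n2) in comparisons.items():
    print(name, round(reduced_crl(crude, n1, n2), 2), round(probable_error_crude(m), 2))
```

Under these assumptions the computed values agree with the tabled reduced coefficients to within rounding of the crude values.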

* The standard deviations of the Farringdon Street series were used in computing the coefficients. 


For characters considered singly, there are very few significant differences 
between the means for the short Maiden Castle series and those for the three 
later series. All the differences from the Anglo-Saxon values are quite in- 
significant, the highest of the 29 α's being 3.8. The only α's greater than 10 are 
for the nasal angle (N∠, 15.0) and basio-bregmatic height (H', 11.8) in comparisons 
with the British Iron Age means, for H' (39.7), 100 H'/L (32.7) and 100 B/H' 
(28.7) in comparisons with the Farringdon Street means, and for H' (18.9), 
100 H'/L (16.3), N∠ (15.2) and 100 B/H' (10.5) in comparisons with the White- 
chapel means. The Maiden Castle skulls are markedly orthognathous judging 
from the nasal angle, but this measurement is only available for 14 male skulls 
and little stress can be laid on the peculiarity. Otherwise the type is only 
distinguished by a calvarial height which is large both absolutely and also 
relative to the length and breadth of the brain-box. It is known that the 
Anglo-Saxon is distinguished in precisely the same way from the other types 
considered. 
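
Each α quoted above is the contribution of a single character to the coefficient of racial likeness. A minimal sketch, assuming the usual definition α = n1 n2/(n1 + n2) × ((m1 − m2)/σ)², where σ is the standard deviation of the character in the long comparison series: the standard deviations are not given in the text, so the second (hypothetical) function only shows what value of σ a quoted α would imply, here for H' in the comparison with Farringdon Street.

```python
import math


def alpha(m1, n1, m2, n2, sigma):
    """Single-character contribution to the C.R.L. (assumed definition)."""
    return (n1 * n2) / (n1 + n2) * ((m1 - m2) / sigma) ** 2


def implied_sigma(m1, n1, m2, n2, alpha_value):
    """Standard deviation that would reproduce a quoted alpha (inverse of alpha())."""
    return abs(m1 - m2) * math.sqrt((n1 * n2) / ((n1 + n2) * alpha_value))


# H' means: Maiden Castle total 137.1 (22), Farringdon Street 129.7 (118); alpha quoted as 39.7
sigma_h = implied_sigma(137.1, 22, 129.7, 118, 39.7)
print(round(sigma_h, 2))  # implied standard deviation of H' in mm.
```

An implied standard deviation of about 5 mm. is of the order usually found for cranial heights, which suggests that the assumed definition is the one in use here.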

Table IV gives male means of the three principal diameters and the three 
indices derived from them for the series referred to above and the following: 

(e) Three series of Romano-British skulls described by Buxton (1935), the 
six measurements in question being the only ones available for these. 

(f) An Anglo-Saxon series from Burwell, Cambridgeshire (Brash et al. 1936). 

(g) A third seventeenth-century London series from a burial-pit at Moorfields 
(Macdonell, 1906). 

(h) A modern series of Lowland Scottish skulls measured by Turner (1903) 
and compiled in the way described by Hooke (1926, pp. 22 and 38). 

(i) A modern series from Glasgow (Young, 1931). 

All the series included in the table are believed to represent populations 
which prevailed in England and the south of Scotland at certain periods from 
the beginning of the Iron Age to modern times, and the minority populations for 
which there is craniological evidence are omitted. The length, breadth and 
cephalic index are seen to be remarkably constant throughout, while the basio- 
bregmatic height and the two indices involving this diameter show larger 
differences. The greatest heights, highest height-length and lowest breadth- 
height indices are for the two series from Maiden Castle, and they thus show a 
remarkable concordance in spite of their very restricted sizes. In these respects 
one of the Romano-British series (the Brigantes) and the two Anglo-Saxon come 
next, and the others follow, with one other Romano-British (the Dobuni) and 
two of the seventeenth-century London series at the lower end of the scale. 
Little significance can be attached to this order, however, as some of the series 
are short. There are no statistically significant differences, for example, between 
the means for the total series from Maiden Castle and those for the Romano- 
British Belgae. The populations represented in Table IV are all remarkably 
similar in type, and they must have been closely inter-related. Larger series 

TABLE IV 

Mean measurements of British series of male skulls 

                                          L             B             H'            100 B/L       100 H'/L      100 B/H'
Iron Age and Romano-British series
  Maiden Castle, Belgic War Cemetery      187.7 (14)    141.4 (14)    137.1 (14)    75.4 (14)     73.4 (13)     103.3 (13)
  Maiden Castle, Others                   189.9 (9)     140.0 (10)    136.9 (8)     73.5 (9)      72.4 (8)      102.7 (8)
  Maiden Castle, Total                    188.6 (23)    140.8 (24)    137.1 (22)    74.6 (23)     73.0 (21)     103.1 (21)
  “British Iron Age”                      187.4 (61)    141.4 (102)   132.9 (77)    76.4 (61)     70.9 (61)     106.3 (77)
  Romano-British, Belgae                  189.6 (40)    141.0 (41)    134.2 (33)    74.4 (40)     71.0 (33)     106.6 (33)
  Romano-British, Dobuni                  190.8 (86)    144.2 (77)    132.6 (46)    75.6 (71)     69.4 (43)     109.9 (41)
  Romano-British, Brigantes               189.9 (67)    141.7 (67)    136.8 (38)    76.7 (61)     71.4 (34)     104.4 (33)
Other series
  Anglo-Saxon, London museums             190.6 (58)    141.7 (103)   136.0 (31)    74.7 (52)     71.2 (25)     104.9 (61)
  Anglo-Saxon, Burwell                    189.6 (45)    141.7 (46)    136.3 (40)    74.8 (48)     71.9 (40)     104.3 (40)
  17th-century London, Farringdon Street  188.8 (139)   142.4 (141)   129.7 (118)   75.4 (132)    68.6 (116)    109.8 (117)
  17th-century London, Whitechapel        189.1 (137)   140.7 (136)   132.0 (122)   74.3 (131)    70.0 (120)    106.6 (122)
  17th-century London, Moorfields         189.2 (44)    143.0 (46)    129.8 (34)    75.5 (42)     68.4 (31)     110.2 (34)
  Modern Scottish, Lowland                188.8 (64)    142.1 (64)    133.6 (52)    76.3 (84)     70.9 (62)     106.4 (62)
  Modern Scottish, Glasgow                188.2 (524)   139.1 (524)   132.9 (521)   76.0 (524)    70.7 (621)    104.9 (521)



representing the Iron Age population are obviously required, but the data 
available clearly focus attention on mean differences between the absolute and 
relative magnitudes of the calvarial height. It can be seen from the distribution 
on p. 301 above that all the male skulls from Maiden Castle have height-length 
indices greater than 69.0. In Table IV there are two seventeenth-century London 
series with means for the index less than this value, and one of the Romano- 
British series has a mean of 69.4. These conditions indicate unusual separation 
of the distributions for different series, and there is no reason to suspect that any 
cranial characters other than the height and the height indices would distinguish 
the populations represented in Table IV as effectively. 
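
The three indices of Table IV are simple ratios of the principal diameters. A minimal sketch, assuming the usual definitions 100 B/L, 100 H'/L and 100 B/H' and applying them to the mean diameters of the total Maiden Castle series; the function name is hypothetical. The tabled mean indices are averages of individual indices taken over different numbers of skulls, so they need not coincide exactly with indices computed from the mean diameters.

```python
def indices_from_diameters(length, breadth, height):
    """Cephalic, height-length and breadth-height indices from L, B and H' in mm."""
    return {
        "100 B/L": 100.0 * breadth / length,
        "100 H'/L": 100.0 * height / length,
        "100 B/H'": 100.0 * breadth / height,
    }


# Mean diameters of the total Maiden Castle male series (Table IV): L, B, H'
total = indices_from_diameters(188.6, 140.8, 137.1)
for name, value in total.items():
    print(name, round(value, 1))
```

The indices of the means differ from the tabled mean indices (74.6, 73.0, 103.1) by a few tenths of a unit at most, as expected when the numbers of specimens vary from character to character.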


4. Measurements of the mandibles 

Measurements of the mandibles were taken in accordance with the revised 
technique given in Biometrika (Morant et al. 1936, Appendix), and readings for 
individual bones are not provided in the present paper. It has been found 
(Cleaver, 1937) that larger numbers of lower jaws than of crania are needed to 
give any decisive racial comparisons, and hence the means for all the adult 
specimens of Iron Age and Romano-British date taken together are the only 
ones worth considering. They are given in Table V for totals of thirty-four male 
and twenty-five female bones, and values for Anglo-Saxon (Morant, 1926) and 
a seventeenth-century London (Cleaver, 1937) series are included for com- 
parative purposes. 


TABLE V 

Mean measurements of English series of mandibles 

[The symbols of the characters are illegible in the source, except for the three angles; rows are numbered in the printed order. "?" marks an illegible digit and ".." an illegible value.] 

                       Male                                                     Female
         Anglo-       Maiden Castle:  Farringdon Street,       Anglo-       Maiden Castle:  Farringdon Street,
Row      Saxon        Iron Age        London: 17th century     Saxon        Iron Age        London: 17th century
1        12?.7 (25)   ..    (19)      117.7 ± 0.5? (23)        116.6 (22)   117.5 (12)      110.8 ± 0.64 (30)
2        100.4 (30)   99.5 (25)        97.7 ± 0.70 (40)         92.9 (36)    90.8 (18)       85.7 ± 0.65 (60)
3        100.3 (27)   99.3 (20)        96.9 ± 0.68 (29)         93.2 (28)    95.0 (18)       91.7 ± 0.54 (35)
4         4?.3 (57)   44.8 (33)        43.9 ± 0.24 (40)         44.1 (50)    42.6 (24)       42.9 ± 0.25 (49)
5         21.7 (38)   21.0 (27)        19.8 ± 0.20 (34)         19.1 (35)    20.0 (14)       18.0 ± 0.10 (45)
6        107.2 (31)   105.3 (24)      104.1 ± 0.64 (34)        104.2 (45)    98.9 (16)       99.4 ± 0.81 (43)
7         77.2 (42)   79.0 (27)        74.9 ± 0.40 (40)         74.6 (49)    70.6 (19)       69.7 ± 0.37 (60)
8         33.2 (61)   33.1 (33)        30.9 ± 0.28 (40)         31.0 (6?)    30.4 (24)       28.3 ± 0.26 (60)
9         28.4 (69)   28.4 (23)        28.2 ± 0.18 (22)         27.6 (57)    27.2 (19)       27.7 ± 0.22 (19)
10        ..   (40)   34.7 (22)        30.9 ± 0.46 (12)         30.5 (31)    30.? (17)       29.7 ± 0.36 (26)
11        27.2 (51)   28.2 (20)        24.9 ± 0.45 (19)         24.4 (52)    26.6 (16)       23.6 ± 0.51 (12)
12        ?6.7 (48)   70.6 (28)        64.8 ± 0.47 (40)         69.2 (47)    60.2 (26)       56.5 ± 0.47 (60)
13        64.0 (46)   6?.2 (26)        62.2 ± 0.40 (36)         59.1 (45)    67.3 (20)       63.6 ± 0.44 (43)
14 M∠    120.3° (47)  116.5° (27)     121.?° ± 0.60 (40)       122.6° (49)  123.?° (19)     127.8° ± 0.69 (60)
15 R∠     72.0° (36)  77.6° (24)       72.0° ± 0.94 (37)        68.2° (36)   71.6° (18)      71.2° ± 0.87 (48)
16 C'∠    68.2° (32)  69.8° (18)       61.8° ± 1.3? (16)        70.0° (25)   70.3° (14)      67.3° ± 0.80 (27)
17        60.9 (27)   67.9 (22)        62.4 ± 0.66 (34)         68.3 (38)    60.6 (16)       56.8 ± 0.49 (43)
18        94.4 (15)   93.9 (17)        92.3 ± 1.03 (25)         91.7 (26)    97.1 (12)       92.4 ± 0.74 (30)
19       129.0 (32)   127.4 (25)      130.9 ± 1.23 (40)        126.2 (35)   129.2 (18)      123.3 ± 1.02 (60)
20        61.6 (46)   60.8 (26)        60.0 ± 0.54 (36)         63.0 (43)    53.5 (19)       53.2 ± 0.66 (43)
21        99.3 (19)   102.4 (18)      102.9 ± 1.07 (29)         90.3 (23)    96.2 (14)       94.0 ± 0.86 (36)



None of these is long enough to give an adequate representation of the type 
for the population it represents, and probable errors are only available for the 
last. A few general comparisons are sufficiently suggestive, however, to be of 
interest. For both sexes the two earlier types are very similar in size, and for 
several characters both appear to be significantly larger than that of the 
Farringdon Street series. There is only one measurement of size which provides a 
clear exception to this rule — viz, the length of the dental arcade from second 
molar to first premolar — and it is the only measurement taken relating to 
the size of the teeth. Both Anglo-Saxon and Iron Age types appear to be 
distinguished from that of seventeenth-century Londoners on account of less 
protruding chins (C'∠ greater), and the Iron Age appears to be distinguished 
from the other two on account of a more outstanding coronoid process (judging 
from M∠ and 100 c_r h/ml). Otherwise, there is no clear evidence of distinctions 
between the types. 

Judging from all the characters, the Anglo-Saxon and Maiden Castle popu- 
lations were just distinguished by features of their lower jaws, but the resemblance 
between them was closer than that between either and the later Londoners. The 
relations found favour the hypothesis that the mandibles of Englishmen, but 
not their teeth, became slightly smaller in historical times, but more data will be 
needed to substantiate it. If the hypothesis be accepted, then mandible measure- 
ments should not be used to estimate racial relationships unless allowance is 
made for secular changes in them within the same population. 


5. Measurements of the lengths of the long bones 

Individual readings are given in Table VIII, and means of the adult series for 
the characters of greatest interest are in Table VI. The lengths were taken in the 
ways specified by Münter (1936), whose means for Anglo-Saxon skeletons are 
quoted. In spite of the small numbers, it was thought worth while computing 
separate means for (a) the Belgic War Cemetery series, and (b) for all the other 
specimens of Iron Age or Roman date from Maiden Castle. 

For both sexes all the absolute measurements for these two series in Table VI 
are less than the corresponding values for Anglo-Saxons. A rough appreciation 
of the significance of the differences between the means can be obtained by 
supposing that the two Maiden Castle populations considered, and also that 
made up by combining them, exhibited the same variabilities as the total 
Anglo-Saxon population. This may appear to be a very arbitrary assumption, 
but in fact it is not unreasonable, since a close approach to equality in variation 
is usually found on comparing different subgroups with a total population, and 
also on comparing distinct populations. 
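
The rough test described above can be sketched as follows, using figures from Table VI. This is only an illustration under stated assumptions: the Anglo-Saxon standard deviation is recovered from the tabled probable error of the mean (taken as 0.6745 σ/√n), the same σ is applied to the Maiden Castle series, and the difference of means is expressed in units of its standard error. The function names are hypothetical, and the authors may have judged significance by a different criterion.

```python
import math


def sd_from_probable_error(pe_mean, n):
    """Standard deviation implied by the probable error of a mean (pe = 0.6745 * sd / sqrt(n))."""
    return pe_mean * math.sqrt(n) / 0.6745


def difference_ratio(m1, n1, m2, n2, sd):
    """Difference of two means divided by its standard error, assuming a common sd."""
    standard_error = sd * math.sqrt(1.0 / n1 + 1.0 / n2)
    return (m1 - m2) / standard_error


# Female humerus (H. max.): Anglo-Saxon 312.6 ± 1.64 (47), all Maiden Castle 298.9 (19)
sd_humerus = sd_from_probable_error(1.64, 47)
ratio_humerus = difference_ratio(312.6, 47, 298.9, 19, sd_humerus)

# Female femur (F. max.): Anglo-Saxon 426.1 ± 2.05 (66), all Maiden Castle 411.5 (21)
sd_femur = sd_from_probable_error(2.05, 66)
ratio_femur = difference_ratio(426.1, 66, 411.5, 21, sd_femur)

print(round(ratio_humerus, 2), round(ratio_femur, 2))
```

On these assumptions the female humerus difference is about three times its standard error, while the female femur difference is appreciably smaller relative to its error.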

By applying the Anglo-Saxon standard deviations, we reach the conclusion 
that there are no significant differences between the means of the absolute and 
indicial measurements in the case of the Belgic War Cemetery and the other 
series from the same site, and this is true for both sexes. On the same assump- 
tion with regard to variation, it is found that the pooled Maiden Castle means 
only show markedly significant differences from the Anglo-Saxon in the case of 
the length of the femur for males, the length of the humerus for males and 
females, and the radius-humerus index for males. The differences between the 
statures cannot be supposed clearly significant in the case of either sex, but it is 
safe to conclude from the evidence of both that the Iron Age inhabitants of 
Maiden Castle were shorter than Anglo-Saxons. The sex ratios for stature are 
almost identical, being 1.076 for the former and 1.073 for the latter series, and 
of the order usually found. In inches the estimates of height obtained are 

TABLE VI 

Means for lengths of the right femur, tibia, humerus and radius, and indices derived 
from these lengths, for Maiden Castle (Iron Age and Romano-British) and 
Anglo-Saxon series of adult skeletons 

[Lengths and statures in mm., with numbers of specimens in brackets; ".." marks a cell that is missing in the source and "?" an illegible digit.] 

Male 
                                                 Other than
                                                 Belgic War    Belgic War    All Maiden
                                                 Cemetery      Cemetery      Castle        Anglo-Saxon
F. max.                                          437.6 (11)    443.2 (15)    ..            463.3 ± 1.22 (163)
T. max.*                                         362.8 (11)    371.2 (18)    ..            378.9 ± 1.46 (103)
H. max.                                          323.0 (12)    328.8 (17)    ..            337.1 ± 1.16 (121)
R. max.                                          240.8 (10)    250.9 (16)    ..            251.6 ± 1.12 (79)
100 T. obl./F. obl.                               81.2 (10)     82.5 (14)    ..             81.1 ± 0.17 (92)
100 H. max./F. obl.                               73.9 (11)     74.3 (13)    ..             73.5 ± 0.14 (100)
100 R. max./H. max.                               76.4 (10)     77.2 (16)    ..             74.6 ± 0.19 (62)
100 (H. max. + R. max.)/(F. obl. + T. max.)†      71.2 (9)      71.6 (12)    ..             70.7 ± 0.16 (41)
Reconstructed stature                            1640 (13)     1654 (19)     ..            1683 (161)‡

Female 
                                                 Other than
                                                 Belgic War    Belgic War    All Maiden
                                                 Cemetery      Cemetery      Castle        Anglo-Saxon
F. max.                                          411.9 (13)    411.0 (8)     411.5 (21)    426.1 ± 2.05 (66)
T. max.*                                         336.4 (14)    342.7 (7)     338.6 (21)    360.4 ± 1.99 (44)
H. max.                                          300.8 (11)    296.4 (8)     298.9 (19)    312.6 ± 1.64 (47)
R. max.                                          223.0 (13)    216.1 (7)     220.3 (20)    227.6 ± 1.41 (34)
100 T. obl./F. obl.                               81.? (13)     81.2 (6)      81.4 (19)     80.8 ± 0.20 (38)
100 H. max./F. obl.                               74.6 (8)      72.3 (7)      73.6 (16)     73.6 ± 0.26 (36)
100 R. max./H. max.                               74.2 (9)      73.4 (7)      73.8 (16)     73.6 ± 0.22 (31)
100 (H. max. + R. max.)/(F. obl. + T. max.)†      70.8 (7)      69.7 (6)      70.3 (12)     70.6 ± 0.21 (21)
Reconstructed stature                            1536 (17)     1527 (9)      1532 (26)     1568 (69)‡

* Maximum length of the tibia including the spine. 
† Maximum length of the tibia excluding the spine. 
‡ The reconstructed statures for Anglo-Saxons were found from the mean lengths for different long bones, 
and the numbers of individuals given are the numbers of femora involved. The statures actually relate to rather 
larger numbers of skeletons. 


5 ft. 6¼ in. for Anglo-Saxon and 5 ft. 5 in. for Iron Age men, and 5 ft. 1¾ in. for 
Anglo-Saxon and 5 ft. 0¼ in. for Iron Age women. The average stature of the 
general male adult population of England to-day is about 5 ft. 7|in., and for 
different social classes means between about 5 ft. 5| in. and 6 ft, 9^ in. are 
found. The Iron Age men were thus decidedly short compared with modern 
Englishmen, and the estimated stature for them is very close to the average 
found for all European populations to-day. 
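
The conversion of the reconstructed statures to feet and inches, and the sex ratio for stature, can be checked with a few lines. This is a minimal sketch: the helper names are hypothetical, and the Anglo-Saxon mean statures are taken as 1683 mm. for men and 1568 mm. for women, the latter being the reading of Table VI that agrees with the quoted ratio of 1.073.

```python
MM_PER_INCH = 25.4


def mm_to_feet_inches(mm):
    """Convert a stature in millimetres to (whole feet, remaining inches)."""
    total_inches = mm / MM_PER_INCH
    feet = int(total_inches // 12)
    return feet, total_inches - 12 * feet


def sex_ratio(male_mm, female_mm):
    """Ratio of the male to the female mean stature."""
    return male_mm / female_mm


# Mean reconstructed statures in mm. (see the assumptions stated above)
for label, stature in [("Anglo-Saxon men", 1683),
                       ("Anglo-Saxon women", 1568),
                       ("Iron Age women", 1532)]:
    feet, inches = mm_to_feet_inches(stature)
    print(f"{label}: {feet} ft. {inches:.2f} in.")

print(round(sex_ratio(1683, 1568), 3))
```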

For the measurements considered, the only distinction between the two 
ancient populations other than that in size is found for the index expressing the 
length of the radius as a percentage of the length of the humerus in the case of 
the male, but not in that of the female, sample. It might be suggested that the 
proportions of the upper and fore-arms of the men who lived at Maiden Castle 
were influenced on the right, or on both, sides by continual practice in the use 
of the sling, which is known to have been one of their principal weapons from 
the evidence of caches of stones. The following means, which relate only to 
paired indices, are of interest in this connexion: 



           100 × Radius max./Humerus max.                 100 × (H. max. + R. max.)/(F. obl. + T. max. ex spine)
           Maiden Castle               Anglo-Saxon        Maiden Castle               Anglo-Saxon
           Male          Female        Male               Male          Female        Male
R.         76.3 (18)     73.6 (14)     74.6 (25)          71.5 (11)     70.3 (12)     70.9 (21)
L.         78.4 (18)     75.0 (14)     75.7 (25)          70.8 (11)     68.9 (12)     70.0 (21)
R.-L.      -2.1          -1.4          -1.1               +0.7          +1.4          +0.9


All the series are very restricted in size, but there is complete agreement in 
the signs of the side and sex differences found. The most significant difference 
between the two populations appears to be that for the male radius-humerus 
index on the left side. This might suggest that the Maiden Castle men were left- 
handed slingers, but their intermembral indices are so close to the Anglo-Saxon 
values that it seems unsafe to accept the hypothesis that the lengths of their 
arms were affected by use. More abundant material might tell definitely for or 
against such a view, and a detailed anatomical examination of the arm bones of 
the Maiden Castle and modern series would also be relevant to the question. 

6. Summary and conclusions 

The material studied consists of the imperfect skeletal remains of eighty- 
three individuals, who are distributed in Table I according to periods, sexes, 
and ages at death. This paper is concerned chiefly with the customary measure- 
ments of the crania and mandibles, and with the lengths of the long bones. 
Comparisons of various kinds are made between: 

A, the Belgic War Cemetery series, representing the defenders of the Castle 
who were massacred by Roman invaders in A.D. 43, and 

B, a composite series made up by all the other specimens of Iron Age and 
Roman date. 

There is no clear distinction between the distributions of the ages at deaths 
of the adults forming the two short series, but for both the men died at a rather 
younger age, on the average, than those interred in English cemeteries of later 
date, while the females are not distinguished in the same way. Remarks on 
cranial anomalies are provided. The teeth of the inhabitants of Maiden Castle 
were not well preserved. 

Judging from their appearance, nearly all the crania in series A and B would 
be considered quite unexceptional if found in any British collection of later date 
than the Bronze Age, but there are few examples of the markedly retreating 
frontal bone which characterizes seventeenth-century Londoners. There is 
agreement between the male and female samples and this is confirmed by 
measurements of the skulls and long bones. 

There are no statistically significant differences between the mean cranial 
(Table III) and long bone (Table VI) measurements of series A and B. Both are 
characterized by a large calvarial height, and the estimates of stature they give 
are about 1 in. less than the Anglo-Saxon values. The type of the total series 
(A+B) of crania is found to be indistinguishable from the Anglo-Saxon, while 
both are differentiated from the types of seventeenth-century London series. 
This conflicts with the conclusion reached previously, from very inadequate data, 
for Iron Age and Romano-British crania, to the effect that the Iron Age and 
recent types are very similar while the Anglo-Saxon stands apart from both. 
More abundant material representing the Iron Age and Romano-British popula- 
tions will be required to disclose the relationships of these closely allied groups. 
The cephalic index is practically constant for them, and the cranial types are 
distinguished most clearly by differences in the absolute and relative magnitude 
of the calvarial height (Table IV). 

Measurements of the mandibles make a slight distinction between the 
Maiden Castle and Anglo-Saxon series, while both types are larger than that of 
seventeenth-century Londoners. The Iron Age men, but not the women, are 
distinguished from Anglo-Saxons by having a larger radius-humerus index. It 
is not clear that this difference is due to the fact that the men at Maiden Castle 
were slingers. 

APPENDIX 

Tables {VII and VIII) of individual measurements and remarks 

The letters denoting periods (or groups) given in the third columns of the tables are: 
N. = Neolithic, I.A. = Iron Age (A, B or C), S. = Saxon, R.-B. = Romano-British, B. = Belgic, 
B.W.C. = Belgic War Cemetery. The letters denoting cranial measurements are those used 
in all craniometric papers in Biometrika. A list of them is given in the present volume, 
p. 162, but this does not include B" = maximum frontal breadth (Martin, No. 10), and 
Biast. B. = biasterionic breadth (M. 12). Owing to their fragility, the capacities of the crania 
could not be determined by any direct method. The estimates given in Table VII were 
obtained by applying the reconstruction formulae involving L, B and H' given by Hooke 
(1926, p. 33). The reconstructed statures in Table VIII were obtained by applying the 
formulae given by Pearson (1898), using as many as possible of the bones involved in each 
case. The lengths of the long bones were determined in the ways adopted by Münter (1936). 
All the readings given in the two tables, whether queried or not, can be considered close ap- 
proximations to the true values, with the possible exception of the few measurements of 
the Neolithic skeletons enclosed in square brackets. These are more uncertain than the others, 
but they are given because the specimens are of particular interest. 

Unless otherwise stated in the remarks given in Table VII, the cranium and mandible 
are complete, or almost complete, with both dental arcades intact and no teeth lost from 
either jaw before death, and the coronal, sagittal and lambdoid sutures are open. Absence of 
a third molar denotes that the tooth had never erupted, as far as can be told. Two male skulls 
are not in the tables, since no measurements can be given for them, but they were included 
in determining the frequencies of qualitative features. These are; 

P 24 (B.W.C.). Sagittal suture closed, coronal and lambdoid beginning to close. Upper 
jaw incomplete and some teeth lost before death, apparently only 1 incisor R. ; 2 teeth lost 
from mandible and M3's absent, abscess cavities at roots of M1's. 

R2 (?I.A.). Sagittal suture closing. Upper jaw missing. Superficial wound on frontal 
bone. 


LIST OF PLATES 

Plate I. A typical male skull (P 30) from the Belgic War Cemetery. This specimen has a cephalic
index which is very close to the mean for the series, but its height-length index (70.9) is below
the average (73.4) and it is peculiarly orthognathous (P∠ = 90°). The lower incisors are crowded.

Plate II. Exceptional skulls from the Belgic War Cemetery.

A. A female adolescent cranium (P 20) with injury to the right malar bone. 

B. A male cranium (P 7 A) with a trace of a suture on the right parietal bone. 

C and D. A female skull (P 36) with an exceptional type of facial skeleton and high cephalic
index (87.0).

Plate III. Skulls from Maiden Castle with dental and other anomalies. 

A. A female cranium (T 21) with cyst cavities at the roots of the second premolars. 

B. The broken basi-occipital part of a male cranium (0 3) showing the extension into it of
sphenoidal air sinuses.

C. The palate of a male cranium (P 7) with a large anterior palatine foramen. 

D. The palate of a female cranium (T 28) showing absence of the canines and rotation of the 
first premolars. 

E. A female mandible (P 14) with the left third molar impacted and mandibular torus.

Goodman and Morant: Human Remains from Maiden Castle

[Plate II. Exceptional skulls from the Belgic War Cemetery. A. P 20, female. Injury to right malar bone. B. P 7 A, male. Trace of suture on right parietal bone. C. P 36, female. Exceptional type of facial skeleton and high cephalic index (87.0). D. P 36, female. Profile view of the same skull (C).]

[Plate III. Skulls from Maiden Castle with dental and other anomalies. A. T 21, female. Cyst cavities at roots of second premolars. B. 0 3, male. Broken basi-occipital showing the extension into it of sphenoidal air sinuses. C. P 7, male. Large anterior palatine foramen. D. T 28, female. Absence of canines and first premolars rotated. E. P 14, female. Left third molar impacted and mandibular torus.]



HOMOGENEITY OF RESULTS IN TESTING SAMPLES
FROM POISSON SERIES

WITH AN APPLICATION TO TESTING CLOVER SEED FOR DODDER


By J. PRZYBOROWSKI and H. WILENSKI
Plant Breeding and Agricultural Experimentation Department,
University of Krakow, Poland

1. Introductory

In our previous paper, “Statistical principles of routine work in testing clover
seed for dodder” (Przyborowski & Wilenski, 1935), we justified the assumption
that the number of dodder seeds in samples of clover does follow the Poisson
Law:

$$p(x) = e^{-m}\,\frac{m^{x}}{x!} \qquad (x = 0, 1, 2, \ldots). \tag{1}$$

Thus we constructed rules which should be followed in sampling problems 
where that Law holds good. We analysed particularly the application of the 
rules to the routine work in testing clover seed for dodder. 

The purpose of this paper is to present some theoretical considerations 
relating to the problem of homogeneity of results in testing samples drawn 
from Poisson series which we have found arising in the course of our work on 
testing clover seed. 

Let $m_1$ and $m_2$ denote the parameters of the Poisson distributions of the
variables $x_1$ and $x_2$; the problem to be considered consists in testing the hypothesis
$H_0$ ($m_1 = m_2 = m$), that the values $x_1$ and $x_2$ were obtained in sampling
from Poisson series, the parameters of which have a common but unspecified
value, say $m$.

The method of testing statistical hypotheses as developed by Neyman &
Pearson (1933) consists in selecting a rule of rejecting the hypothesis in question,
$H_0$, whenever the sample point $E$ (the co-ordinates of which in the $n$-dimensional
sample space $W$ are the data of observation $x_1, x_2, \ldots, x_n$) lies within the
so-called critical region, say $w$, of the sample space $W$. The probability,
$P\{E \in w \mid H_0\} = \alpha$, of rejecting the hypothesis tested, when it is true, is called
the size of the corresponding critical region $w$ on which the test is based. The
errors thus committed when rejecting the true hypothesis $H_0$ are called the
errors of the first kind. The probability $P\{E \in w \mid H'\}$ of rejecting the hypothesis
$H_0$ when an alternative $H'$ is true has been termed the power of the test with
regard to $H'$. $P\{E \in w \mid H'\}$ considered as a function of $H'$, where $H'$ is any
hypothesis belonging to the class $\Omega$ of alternatives to $H_0$, is called the power
function of the test. The error committed when we fail to reject $H_0$ although

[…]

The hypothesis to be tested, $H_0$, is that $m_1 = m_2$ or that $\rho = \tfrac12$; we shall
suppose that the admissible alternatives to $H_0$ are both that $m_1 > m_2$ and
$m_1 < m_2$ or that $\rho < \tfrac12$ and $\rho > \tfrac12$. Since $H_0$ does not specify the value of $\mu$ (which
may have any value between 0 and $\infty$), it is a composite hypothesis in the
sense of Neyman & Pearson. To test it, we should like to be able to use a critical
region, $w$, such that

$$P\{E \in w \mid \rho = \tfrac12, \mu\} = \alpha, \tag{9}$$

whatever be $\mu$. No such region can be found, owing to the discontinuous
character of the distribution of the variables, but it is possible to go a long way
in determining a satisfactory region on the following lines.

Fig. 1. Critical region for case $\alpha = 0.05$. It consists of the two areas containing the larger dots.

It will be noted that if $\rho = \tfrac12$, then the distribution of $x_1$ on each line,
$n$ = constant, is that of the symmetrical binomial $(\tfrac12 + \tfrac12)^n$. If $\rho$ is either $> \tfrac12$
or $< \tfrac12$ we shall have a skew binomial with probability density concentrating
towards the lower or upper end of the line; in order to make the test as efficient
as possible in detecting departures from $\rho = \tfrac12$ in either direction, the critical or
rejection region should therefore include the two tails of each of the separate
binomials. It may be defined as follows.


Build up the critical region $w$ out of pieces, $w(n, \alpha)$, lying on the lines

$$n = x_1 + x_2 = \text{constant}, \tag{10}$$

where $w(n, \alpha)$ includes all sample points on such a line for which

$$x_1 \le k(n, \alpha) \quad \text{and those for which} \quad x_1 \ge n - k(n, \alpha), \tag{11}$$

$k(n, \alpha)$ being a positive integer determined so that

(a) the sum of the terms of the binomial $(\tfrac12 + \tfrac12)^n$ corresponding to
$x_1 = 0, 1, \ldots, k(n, \alpha)$ is $\le \tfrac12\alpha$;

(b) the sum of the terms corresponding to
$x_1 = 0, 1, \ldots, k(n, \alpha) + 1$ is $> \tfrac12\alpha$.

The pieces $w(n, \alpha)$ and part of the complete region $w$ for the case $\alpha = 0.05$ are
indicated in Fig. 1. In view of […] it will be
seen that the region $w$ is such that

$$P\{E \in w \mid \rho = \tfrac12, \mu\} \le \alpha \tag{12}$$

whatever be $\mu$. Thus if we adopt the rule of rejecting $H_0$
whenever the sample point is included in $w$, i.e. falls at any of the points
indicated in Fig. 1 by the larger spots, we know that the risk of the first kind of error
is at most equal to $\alpha$. Further, while it cannot be claimed that the test is
completely unbiased or is the uniformly most powerful,* it seems likely
to be as efficient as any other alternative in detecting […].

In Table I we have given the boundary values $k(n, \alpha)$ of the critical regions
associated with four values of $\alpha$, viz. $\alpha = 0.20$, $0.10$, $0.05$ and $0.01$, for $n$
varying from 0 to 80. Having decided on the appropriate value for $\alpha$, the rule
of the test is: reject the hypothesis that $m_1 = m_2$ if

$$x_1 \le k(n, \alpha) \quad \text{or} \quad x_1 \ge n - k(n, \alpha).$$

It will be seen that for small values of $n$ no test […]
may be possible.

The power function of the test, which depends on both $\rho$ and $\mu$, may be
calculated from the expression

$$P\{E \in w \mid \rho, \mu\} = \sum_{n=0}^{\infty} \frac{e^{-\mu}\mu^{n}}{n!} \sum_{w(n,\alpha)} \frac{n!}{x_1!\,x_2!}\,\rho^{x_1}(1-\rho)^{x_2}, \tag{13}$$

where $\sum_{w(n,\alpha)}$ is the sum of the terms of the binomial $\{\rho + (1-\rho)\}^n$
satisfying the conditions (11).

Values of the power for the two cases $\alpha = 0.10$ and $0.05$ and for a number
of values of $\mu$ and $\rho$ are given in Table II a and b. The computation involved
(a) first finding for a given $n$ the sums of the binomial terms in the second
summation of (13) with the help of the Tables of the Incomplete Beta Function
(1934), and (b) then multiplying this by the appropriate term of the Poisson
series, and (c) finally summing for $n = 0, 1, 2, \ldots$.

* These terms have been […].
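The boundary values $k(n, \alpha)$ follow directly from the tails of the symmetrical binomial, so they can be recomputed as a check on Table I. The following is a minimal sketch in Python, not the authors' procedure (they worked from printed tables); the function name `critical_value` and the convention of returning `None` when no critical region exists are assumptions introduced here for illustration.

```python
from math import comb


def critical_value(n, alpha):
    """Return k(n, alpha): the largest k for which the lower tail
    P(X <= k), X ~ Binomial(n, 1/2), does not exceed alpha / 2.
    Returns None when even P(X = 0) exceeds alpha / 2 (no test possible).
    """
    limit = alpha / 2
    cumulative = 0          # running sum of binomial coefficients C(n, 0..x)
    k = None
    for x in range(n + 1):
        cumulative += comb(n, x)
        if cumulative / 2**n <= limit:
            k = x           # condition (a) is still satisfied
        else:
            break           # condition (b): one more term exceeds alpha / 2
    return k


# Boundary values for a few totals n at the four tabled levels.
for n in (5, 6, 9, 40, 80):
    print(n, [critical_value(n, a) for a in (0.20, 0.10, 0.05, 0.01)])
```

For $n = 80$ the sketch gives 33, 32, 30 and 28, which agree with the last row of Table I; for very small $n$ it returns `None`, reflecting the remark that no test may be possible.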

It will be noted that if the admissible alternatives to $m_1 = m_2$ are only that
$m_1 > m_2$, then the appropriate critical region will be defined by $x_1 \ge n - k(n, \alpha)$,
and the risk of the first kind of error will now be $\le \tfrac12\alpha$. Similarly, if the
alternatives are only $m_1 < m_2$, the region will be defined by $x_1 \le k(n, \alpha)$. These regions
correspond to the lower and upper marked areas, respectively, of Fig. 1.

We have also given in Tables II in certain cases the actual value of
$P\{E \in w \mid \rho = \tfrac12, \mu\}$, that is to say, the chance of rejecting the hypothesis that
$m_1 = m_2$ when it is true. These chances are, as expected, considerably less than
the corresponding values of $\alpha$, but approach $\alpha$ as $\mu$ increases and the
discontinuity in the distributions of $x_1$ and $x_2$ becomes of less importance. In
practice we shall not, of course, know exactly what $\mu$ is, but as we shall probably
have a rough idea of the size of the Poisson $m$ in the particular problems
investigated, the figures in these columns of the Tables will probably be helpful
in determining the value to select for $\alpha$, the upper limit of the significance level.
Table III contains true significance levels for the other two limiting values,
$\alpha = 0.20$ and $\alpha = 0.01$, for which the rejection levels $k(n, \alpha)$ have been given
in Table I.
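The gap between the nominal limit $\alpha$ and the chance actually incurred can be illustrated numerically. The sketch below weights the two binomial tails on each line $n$ = constant by the Poisson probability of that $n$. The names `critical_value` and `true_significance_level`, and the truncation of the Poisson series at `n_max`, are assumptions made for this illustration and do not come from the paper.

```python
from math import comb, exp


def critical_value(n, alpha):
    """Largest k with P(X <= k) <= alpha / 2 for X ~ Binomial(n, 1/2), else None."""
    cumulative, k = 0, None
    for x in range(n + 1):
        cumulative += comb(n, x)
        if cumulative / 2**n <= alpha / 2:
            k = x
        else:
            break
    return k


def true_significance_level(mu, alpha, n_max=200):
    """Chance of rejecting m1 = m2 when it is true and m1 + m2 = mu.

    The Poisson series is truncated at n_max (assumed adequate for moderate mu).
    """
    total = 0.0
    poisson_term = exp(-mu)                      # P(n = 0)
    for n in range(n_max + 1):
        k = critical_value(n, alpha)
        if k is not None:
            lower_tail = sum(comb(n, x) for x in range(k + 1)) / 2**n
            total += poisson_term * 2 * lower_tail   # two equal tails
        poisson_term *= mu / (n + 1)             # P(n + 1) from P(n)
    return total


for alpha in (0.20, 0.10, 0.05):
    print(alpha, round(true_significance_level(5.0, alpha), 3))
```

With $\mu = 5$ the sketch gives roughly 0.065, 0.024 and 0.009 for $\alpha = 0.20$, $0.10$ and $0.05$, each well below its nominal limit, as the text leads one to expect for small expectations.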

From the data of Tables II it was possible to construct the charts given in
Figs. 2 and 3 (pp. 322, 323 below), showing in terms of the Poisson parameters
$m_1$ and $m_2$ (rather than $\mu$ and $\rho$) some of the contours of equal power.

3. Discussion of tables and charts with some illustrative examples

A point which is clearly brought out by the present investigation is that it
is impossible to detect differences between the means of two Poisson series
$m_1$ and $m_2$, when both these quantities are small, even if one is several times
larger than the other. Thus even were $m_1 = 0$ and $m_2 = 5$, the chance of
detecting the difference from the samples would be only 0.384 using $\alpha = 0.05$,
and 0.560 using $\alpha = 0.10$.* As an illustration of the kind of differences that may
be detected, Fig. 2 may be used as follows. It might with reason be regarded
as undesirable to plan an experiment in which the chance was less than 0.5 of
detecting from two random samples that differences between $m_1$ and $m_2$ of
practical importance existed. If we pass along the 50 % power contour in the
diagram, we pass through points $(m_1, m_2)$ of which the following are typical:

(0.0, 4.7), (2.0, 8.1), (6.0, 14.3), (10.0, 19.8), (18.0, 30.6).

If it is important to be able to detect differences of this magnitude, then larger
samples must be taken in order to increase the expectations. If the samples
can, for instance, be increased $c$-fold, the expectations will be increased to
$cm_1$, $cm_2$, a point lying on the same straight line passing through the origin as
$m_1$, $m_2$, but placed $c$ times further out. Since such radial lines will cut the
contours of equal power at rather acute angles, considerable multiplication of
sample sizes may often be needed to ensure detection of differences.

* The true values of the significance level in these cases would be 0.009 and 0.024 respectively;
if the test were used with $\alpha = 0.20$ (and a significance level in this case of 0.065) the chance of
detection would be somewhat greater.
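The chances of detection quoted above can be reproduced from the double sum that defines the power: a Poisson weight for each total $n$, multiplied by the binomial probability of falling in the two tails. The following is a hedged sketch only; `critical_value`, `power` and the truncation point `n_max` are names and choices assumed here, and $m_1 + m_2$ is taken to be greater than zero.

```python
from math import comb, exp


def critical_value(n, alpha):
    """Largest k with P(X <= k) <= alpha / 2 for X ~ Binomial(n, 1/2), else None."""
    cumulative, k = 0, None
    for x in range(n + 1):
        cumulative += comb(n, x)
        if cumulative / 2**n <= alpha / 2:
            k = x
        else:
            break
    return k


def power(m1, m2, alpha, n_max=200):
    """Chance of rejecting m1 = m2 for given expectations (m1 + m2 > 0 assumed)."""
    mu = m1 + m2
    rho = m1 / mu                       # chance that a given seed falls in sample 1
    total = 0.0
    poisson_term = exp(-mu)             # P(n = 0)
    for n in range(n_max + 1):
        k = critical_value(n, alpha)
        if k is not None:
            # the two tails x1 <= k and x1 >= n - k never overlap, since k < n / 2
            region = list(range(k + 1)) + list(range(n - k, n + 1))
            inner = sum(comb(n, x) * rho**x * (1 - rho)**(n - x) for x in region)
            total += poisson_term * inner
        poisson_term *= mu / (n + 1)    # P(n + 1) from P(n)
    return total


print(round(power(0, 5, 0.05), 3))   # 0.384
print(round(power(0, 5, 0.10), 3))   # 0.56
```

When $m_1 = 0$ every seed falls in the second sample, so the power reduces to the Poisson chance that $n$ is large enough for a critical region to exist; this is why the two values can be checked by hand.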

Regarding a chance of detection of 19 to 1 as satisfactory, we note that the
95 % contour in Fig. 2 passes through points $(m_1, m_2)$ of which the following
are typical:

(0.0, 9.1), (2.0, 16.5), (6.0, 24.0), (10.0, 31.5).

Differences of this kind, if they exist, can therefore be almost certainly detected
using in the test the critical values associated with $\alpha = 0.10$. Rather smaller
differences could be detected with the same high probability if $\alpha$ were taken
as 0.20.

In practice, of course, expected values $m_1$ and $m_2$ are not known in advance,
but we believe that reasonings of this kind based on the tables or charts may
be useful as a preliminary step in planning the extent of sampling required
either in experimental work or routine analysis.

Example. In the testing of clover seed for dodder it is customary to withdraw
a 100 gr. sample from a sack with a long trier or probe. Clearly the number
of dodder seeds found will vary from sample to sample, and from one sack to
another in the same consignment. This may be due to the following causes:

(1) The material from which the samples were drawn was not thoroughly
mixed.

(2) The two samples were drawn from different seed stocks.

(3) There are laboratory errors in analysis.

(4) The existence of random sampling fluctuations.

In applying statistical analysis in the comparison of two counts, the
hypothesis to be tested assumes that the discrepancy to be tested is merely
due to chance, and that the frequency of dodder seeds, $x$, will vary from one
sample to another in accordance with the Poisson law. If a significant difference
is found, it may of course be due to one of the first three sources of error.

Suppose that a 100 gr. sample is drawn from a sealed sack of clover and
sent to a seed testing station, and that no dodder seeds are found in it, i.e.
$x_1 = 0$. The buyer, before opening the sack, takes another 100 gr. sample, and
on having it analysed learns that three seeds have been found, i.e. $x_2 = 3$.
Would he have any justification for criticizing the first testing on the grounds
that the difference between 0 and 3 could not be due to chance? On examining
Table I, it is seen that for $n = x_1 + x_2 = 3$, even using the least stringent of the
four tests, the difference cannot be regarded as significant.*

* It will be noted from Table III that with $\mu$ equal to 6 or less, the true significance level for
the first test with $\alpha = 0.20$ is at about 0.06.

Suppose that on another occasion it was found that $x_1 = 2$ and $x_2 = 6$.
Entering Table I with $n = 8$, we see that for none of the levels of test does
$x_1 = 2$ fall in the critical region. If the purchaser still considered that to find
three times as many dodder seeds in one sample as in the other could not be
due to chance, we might illustrate the danger of drawing conclusions in this
way by a use of one of the charts. If one of the expected numbers were three
times the other, so that $m_2 = 3m_1$, and therefore $\rho = 0.25$, we may ask how large
$\mu = m_1 + m_2$ must be before there is as much as an even chance of detecting the
difference. Taking Fig. 3, it is found that the line $m_2 = 3m_1$ cuts the 50 % power
contour at about the point $m_2 = 13.8$, $m_1 = 4.6$. Thus samples of clover at least
twice as large as those taken would be needed to provide only a 50:50 chance
of detecting a real difference in dodder content of the order suspected by the
purchaser. In this case the true significance level of the test used is seen from
the number on the diagonal line of the chart to be about 0.03.

To be almost certain of detecting a difference with $m_2 = 3m_1$, for example
to make the odds 19 to 1, we notice that a continuation of the same line cuts
the 95 % power contour where $m_2 = 38.4$, $m_1 = 12.8$. For this samples of about
600 gr. would be needed.
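The two comparisons of counts in this example can be run through the rule of the test mechanically. The sketch below is illustrative only: `critical_value` and `reject` are hypothetical helper names, and the boundary values are recomputed from the symmetrical binomial rather than read from Table I.

```python
from math import comb


def critical_value(n, alpha):
    """Largest k with P(X <= k) <= alpha / 2 for X ~ Binomial(n, 1/2), else None."""
    cumulative, k = 0, None
    for x in range(n + 1):
        cumulative += comb(n, x)
        if cumulative / 2**n <= alpha / 2:
            k = x
        else:
            break
    return k


def reject(x1, x2, alpha):
    """Rule of the test: reject m1 = m2 if x1 <= k(n, alpha) or x1 >= n - k(n, alpha)."""
    n = x1 + x2
    k = critical_value(n, alpha)
    return k is not None and (x1 <= k or x1 >= n - k)


levels = (0.20, 0.10, 0.05, 0.01)
for counts in ((0, 3), (2, 6)):
    print(counts, [reject(*counts, a) for a in levels])
```

Neither pair of counts is rejected at any of the four levels, which is the conclusion drawn in the text; with only three seeds in all, no critical region exists even for $\alpha = 0.20$.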


4. Summary 

The problem considered is that in which $x_1$ and $x_2$ are two independent
random variables distributed in accordance with the Poisson law of equation (1),
and it is desired to test the hypothesis that the expectations $m_1$ and $m_2$ are the
same. We have shown how a test may be derived which is independent of the
value of the unknown common hypothetical expectation but which, owing to
the discontinuous nature of the probability distributions, will only provide an
upper limit to the significance level, i.e. to the chance of rejecting the hypothesis
tested when it is true. The manner of approach of the significance level to its
upper limit has, however, been investigated numerically.

A table (Table I) has been provided, containing critical values $k(n, \alpha)$
required in carrying out the test. The power function of the test has also been
determined, and tables (Table II a, b) and charts (Figs. 2, 3) given which make
it possible to determine the chance of detecting differences in the expectations
$m_1$ and $m_2$ of specified magnitudes. Finally, a discussion of some uses of the
test has been added.

In conclusion we should like to express our thanks to Prof. E. S. Pearson
for his helpful suggestions and criticism made during the course of our
investigation.*

* [Certain modifications and additions to the paper have been made since it was received for
publication at the beginning of July 1939. As circumstances have unfortunately made
communication with the authors impossible, I must accept responsibility for these alterations, which
have been mainly concerned with the extension of Tables I and II to higher values of $n$ and $\mu$.
E. S. P.]

TABLE I

Boundary values, $k(n, \alpha)$, for each $n = x_1 + x_2$

           Upper limit of significance              Upper limit of significance
                  level, $\alpha$                          level, $\alpha$
    n    0.20    0.10    0.05    0.01         n    0.20    0.10    0.05    0.01

  1-5       …       …       …       …     41-45       …       …       …       …
    6       …       …       …       …        46      18      16       …       …
    7       1       0       0       …        47      18      17       …      14
    8       1       1       0       0        48      19      17      16      14
    9       2       1       1       0        49      19      18      17      15
   10       2       1       1       0        50      19      18      17      15
   11       2       2       1       0        51      20      19      18      15
   12       3       2       2       1        52      20      19      18      16
   13       3       3       2       1        53      21      20      18      16
   14       4       3       2       1        54      21      20      19      17
   15       4       3       3       2        55      22      20      19      17
   16       …       …       3       2        56      22      21      20      17
   17       …       …       4       2        57      23      21      20      18
   18       5       …       4       3        58      23      22      21      18
   19       6       5       4       3        59      24      22      21      19
   20       6       5       5       3        60      24      23      21      19
   21       7       6       5       4        61      24      23      22      20
   22       7       6       5       4        62      25      24      22      20
   23       7       7       6       4        63      25      24      23      20
   24       8       7       6       5        64      26      24      23      21
   25       8       7       7       5        65      26      25      24      21
   26       …       8       7       6        66      27      25      24      22
   27       9       8       7       6        67      27      26      25      22
   28      10       9       8       6        68      28      26      25      22
   29      10       …       8       7        69      28      27      25      23
   30       …      10       9       7        70      29      27      26      23
   31      11      10       9       7        71      29      28      26      24
   32      11      10       9       8        72      30      28      27      24
   33      12      11      10       8        73      30      28      27      25
   34      12      11      10       9        74      30      29      28      25
   35      13      12      11       9        75      31      29      28      25
   36      13      12      11       9        76      31      30      28      26
   37      14      13      12      10        77      32      30      29      26
   38      14      13      12      10        78      32      31      29      27
   39      15      13      12      11        79      33      31      30      27
   40      15      14      13      11        80      33      32      30      28

… denotes an entry missing or illegible in this copy.

Rule of test: reject hypothesis that $m_1 = m_2$ if $x_1 \le k(n, \alpha)$ or $x_1 \ge n - k(n, \alpha)$.

See note on p. 323 regarding extension of limits for $n > 80$.

Note regarding Table I. When $x_1$ and $x_2$ are large, if the hypothesis tested is true then
$(x_1 - x_2)/\sqrt{x_1 + x_2}$ may be regarded as a unit normal deviate or, from another point of view,
$x_1$ should be normally distributed about $\tfrac12 n$ with standard deviation $\tfrac12\sqrt{n}$. On this basis, in the
case of $n = 80$, the values of $k(n, \alpha)$ for $\alpha = 0.20$, $0.10$, $0.05$ and $0.01$ are found to be 34.3, 32.6,
31.2 and 28.5, respectively. Applying a correction for discontinuity we have 33, 32, 30 and 28 for
the integral values of $k(n, \alpha)$ associated with upper limits of significance level equal to the four
values of $\alpha$. These are the values for $k(n, \alpha)$ given in the last row of Table I. It is clear that
beyond $n = 80$ the normal approximation is valid.
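The normal approximation described in the note can be evaluated directly. The sketch below computes only the uncorrected boundary $\tfrac12 n - z\,\tfrac12\sqrt{n}$, where $z$ is the unit normal deviate cutting off a tail of $\tfrac12\alpha$; the function name `normal_boundary` is an assumption, and the correction for discontinuity is left to the reader because the note does not spell out how it was applied.

```python
from math import sqrt
from statistics import NormalDist


def normal_boundary(n, alpha):
    """Uncorrected normal approximation to k(n, alpha): x1 is treated as
    normal about n / 2 with standard deviation sqrt(n) / 2."""
    z = NormalDist().inv_cdf(1 - alpha / 2)   # deviate cutting off a tail of alpha / 2
    return n / 2 - z * sqrt(n) / 2


for alpha in (0.20, 0.10, 0.05, 0.01):
    print(alpha, round(normal_boundary(80, alpha), 1))
```

For $n = 80$ the four values come out close to 34.3, 32.6, 31.2 and 28.5, each somewhat above the integral boundary in the last row of Table I, as the correction for discontinuity requires.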







ON THE METHOD OF PAIRED COMPARISONS 
By M. G. KENDALL and B. BABINGTON SMITH 


Introduction


1. Suppose we have a number of objects A, B, C, etc. which are to be
considered according to the different degrees in which they exhibit some
common quality. If the quality is measurable in some objective way the
objects will yield a number of variate values, in which case the problem is
amenable to treatment by well-understood methods. It may, however, happen
either for theoretical or for practical reasons that the quality is not measurable.
We then have to rely for a discussion of the variation of the quality on judgments
of a more or less subjective kind carried out after a comparison of the objects
among themselves.

One of the methods of comparison which has been widely used in this 
connexion is that of ranking. An observer examines the objects and arranges 
them in the order in which he judges them to possess the quality under con- 
sideration. This arrangement is called a ranking, and when two or more observers 
provide rankings of the same set of objects there arise the familiar questions of 
the type: is there any significant resemblance between the judgments of 
observers? or, do the data furnish any evidence that the objects have a “real” 
objective ranking? 

2. The ranking method suffers from a serious drawback when the quality
considered is not known with certainty to be representable by a linear variable.
We may, for instance, ask an observer to rank a number of individuals in order
of intelligence, and he may comply with the request in the full belief that he
is doing something within his powers; but if intelligence is not measurable on
a linear scale this ranking may fail to give a real picture either of the observer’s
preference or of the variation of intelligence among the individuals. It is not
impossible that the observer should judge A more intelligent than B, B than C,
and C than A, if the individuals are presented for his consideration one pair at
a time. The likelihood of this happening is obviously increased when we are
dealing with tastes in music, eatables or film stars; and in practice the event
is not uncommon. Such “inconsistent” preferences can never appear in ranking,
for if A is preferred to B and B to C, then A must automatically be shown as
preferred to C. The use of ranking thus destroys what may be valuable
information about preferences.

3. In this paper we consider a more general method of investigating
preferences. With $n$ objects, we shall suppose that each of the $\binom{n}{2}$ possible pairs
is presented to an observer and his preference of one member of the pair noted.
We assume that a choice between two objects can always be made.* With
$m$ observers the data then comprise $m\binom{n}{2}$ preferences. The questions to be
discussed include:

(a) Is there any evidence that a particular observer is capable of forming 
a reliable judgment of the quality under investigation; and if not, is the fault 
his, or is it due to the fact that he has been asked to perform an impossible task? 
(b) Is there any significant concordance of preferences between observers?
(c) Can the quality under discussion be represented by a linear variable? 

4. The method of offering for judgment objects two at a time is known as 
the method of paired comparisons. Hitherto it has been used mainly in human 
psychology, but it has some interesting applications in animal experimentation. 
For instance, in feeding experiments it is impossible to get an animal to rank 
a number of foods in order of preference but it is not difficult to offer pairs of
foods and to note which is taken first. Experiments of this kind have, of course, 
to be conducted with great care to ensure that conditions operating when the 
different pairs are offered are as constant as possible; but the difficulties are 
far from being insuperable and the method of paired comparisons offers a useful 
technique in cases where the more usual procedures cannot be applied. From 
the point of view of theoretical statistics perhaps the most interesting part of 
the present work is that it offers some lines of approach to the difficult question 
whether a given quality can be legitimately regarded as based on a linear 
variable, i.e. whether ranking or scoring methods are justifiable or not. 


Consistence in preferences


5. If the object A is preferred to B we write $A \to B$ or $B \leftarrow A$. The $\binom{n}{2}$
preferences of a single observer may be represented in tabular form as shown
in Table I.

In this table, which is shown for the six objects A to F, an entry of unity
in column Y and row X means $X \to Y$, and is thus accompanied by a
complementary zero in row Y and column X. The diagonals are blocked out. For
example, in Table I, $A \to B$, $A \to C$, $D \to A$, etc.

The arrangement of the objects A to F in the row and column headings is
quite arbitrary. There are $(n!)^2$ ways of representing the same configuration of
preferences in such a table according to the permutations of objects in row and
column; but in practice it is generally desirable to have the order in row and
column the same, and even among the $n!$ possible arrangements so given there
are often practical considerations which determine one order as more convenient
than others.

TABLE I

         A    B    C    D    E    F
    A    —    1    1    0    1    1
    B    0    —    0    1    1    0
    C    0    1    —    1    1    1
    D    1    0    0    —    0    0
    E    0    0    0    1    —    1
    F    0    1    0    1    0    —

* That is, we exclude cases in which an observer cannot make up his mind which object he
prefers, just as in the ranking case one excludes the possibility of split ranks. In practice it
sometimes happens that an observer is genuinely unable to reach a decision. To allow for this fact in
the theoretical discussions would introduce complications of a most intractable kind. When the
effect becomes important in practice it can be allowed for by selecting the set of preferences which
are most unfavourable to the hypothesis under test.

6. Paired comparisons may also be represented geometrically by a method
which can be illustrated for the case of the six objects as follows:

Fig. 1. Geometrical representations of the scheme of preferences of Table I.

We represent the six objects A to F by the six vertices of a regular hexagon
and join the vertices in all possible ways by straight lines. If $A \to B$ we draw
an arrow on the line AB pointing from A to B. The arrows shown on Fig. 1
correspond to the preferences shown in Table I.*

7. If an observer makes preferences of type $A \to B \to C \to A$ we say that the
triad ABC is inconsistent. In the geometrical representation an inconsistent
triad is shown by a triangle in which all the arrows go round in the same
direction. We may thus speak of a “circular” triad of preferences. In Fig. 1
the triads ACD, BEF and three others are circular.
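The circular triads of a preference table can be listed by inspecting every set of three objects. The sketch below does this for the six objects of Table I. The matrix is a transcription of that table as read here, with the few cells that are illegible in this copy filled in from the complementary entries (an assumption), and the names `PREFERS`, `is_circular` and `circular_triads` are introduced only for illustration.

```python
from itertools import combinations

OBJECTS = "ABCDEF"

# PREFERS[x][y] == 1 means the row object x is preferred to the column object y.
# Transcribed from Table I; illegible cells filled by complementarity (assumption).
PREFERS = [
    [0, 1, 1, 0, 1, 1],  # A
    [0, 0, 0, 1, 1, 0],  # B
    [0, 1, 0, 1, 1, 1],  # C
    [1, 0, 0, 0, 0, 0],  # D
    [0, 0, 0, 1, 0, 1],  # E
    [0, 1, 0, 1, 0, 0],  # F
]


def is_circular(p, i, j, k):
    """A triad is circular when each member is preferred exactly once within it."""
    wins = (p[i][j] + p[i][k], p[j][i] + p[j][k], p[k][i] + p[k][j])
    return wins == (1, 1, 1)


def circular_triads(p):
    n = len(p)
    return [t for t in combinations(range(n), 3) if is_circular(p, *t)]


names = ["".join(OBJECTS[v] for v in t) for t in circular_triads(PREFERS)]
print(names)  # ['ABD', 'ACD', 'ADE', 'ADF', 'BEF']
```

Five circular triads are found, ACD and BEF among them, in agreement with the count given for Fig. 1.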

It is also possible to have inconsistent triads of greater extent; but any
such circuit must contain at least two circular triads. Suppose, for instance,
that ABCD is circular, e.g. that $A \to B \to C \to D \to A$. Then either $A \to C$ or $C \to A$.
In the first case ACD is circular, in the second ABC. Similarly either ABD or
BCD is circular. Thus the circular tetrad must contain just two circular triads.
On the other hand it is possible for a tetrad to contain circular triads without
being itself circular.

Similarly, if ABCDE is circular either ABC or ACDE is circular and either
BCD or BDEA is circular. If the two tetrads are circular there must be at least
three circular triads (not necessarily four, because ADE may be common to
both). It is easy to see by an actual example based on this configuration that
there need not be more than three circular triads; and it is clear that there must
be at least three. For if the tetrads are not circular then ABC and BCD must
be so and then either CDE is circular or ABCE is so, adding at least one more.

Generally, it appears that a circular $n$-ad must contain at least $(n - 2)$
circular triads; but it may contain more, and the fact that an $n$-ad contains
$(n - 2)$ circular triads does not mean that it is itself circular. In discussing
inconsistences, therefore, it seems best to confine attention to circular triads,
which, so to speak, constitute the inconsistent elements of the configuration,
and to ignore the more ambiguous criteria associated with circular polyads of
greater extent.
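For small $n$ the statement about circular $n$-ads can be checked by exhaustion. The sketch below fixes the circuit $0 \to 1 \to \cdots \to n-1 \to 0$, tries every possible direction for the remaining preferences, and records how many circular triads each configuration contains. It is a brute-force illustration under assumed names (`count_circular_triads`, `triad_counts_for_circular_polyad`), not a proof.

```python
from itertools import combinations, product


def count_circular_triads(beats, n):
    """Number of triads in which each member is preferred exactly once."""
    count = 0
    for i, j, k in combinations(range(n), 3):
        wins = (beats[i][j] + beats[i][k],
                beats[j][i] + beats[j][k],
                beats[k][i] + beats[k][j])
        if wins == (1, 1, 1):
            count += 1
    return count


def triad_counts_for_circular_polyad(n):
    """Set of circular-triad counts over all preference schemes on n objects
    that contain the circuit 0 -> 1 -> ... -> n-1 -> 0."""
    cycle = {(i, (i + 1) % n) for i in range(n)}
    free_pairs = [(i, j) for i, j in combinations(range(n), 2)
                  if (i, j) not in cycle and (j, i) not in cycle]
    counts = set()
    for choice in product((0, 1), repeat=len(free_pairs)):
        beats = [[0] * n for _ in range(n)]
        for i, j in cycle:
            beats[i][j] = 1
        for (i, j), forward in zip(free_pairs, choice):
            if forward:
                beats[i][j] = 1
            else:
                beats[j][i] = 1
        counts.add(count_circular_triads(beats, n))
    return counts


for n in (4, 5, 6):
    print(n, sorted(triad_counts_for_circular_polyad(n)))
```

For the tetrad the only count that occurs is two, and for the pentad and hexad the smallest counts are three and four, in line with the lower limit $(n - 2)$.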

8. We now prove the following theorems:

(1) The maximum possible number of circular triads is $(n^3 - n)/24$ if $n$ is
odd and $(n^3 - 4n)/24$ if $n$ is even; and the minimum number is zero.

(2) These limits can always be attained by some configuration of
preferences.

(3) For any integral number between the maximum and the minimum
there exists at least one preference-configuration with that number of circular
triads; and in general there will be more than one.

Consider a polygon of the type shown in Fig. 1 with $n$ vertices. There will
be $(n - 1)$ lines emanating from each vertex. Let $a_1, a_2, \ldots, a_n$ be the number
of lines at the respective vertices on which the arrows leave the vertex.
Then

$$\sum_{r=1}^{n} a_r = \binom{n}{2}$$

and the mean value of $a$ is $(n - 1)/2$.
Define

$$T = \sum_{r=1}^{n} \left\{a_r - \tfrac12(n - 1)\right\}^2. \tag{1}$$

* These preferences were obtained in an experiment on a dog, which was offered the following
foods in pairs: meat, biscuit, chocolate, apple, pear and cheese. The members of a pair were cut
to the same size and placed equidistantly from the dog, which was then released and allowed to
choose. All the pieces of food were eaten avidly, it being that sort of dog, but there were considerable
inconsistences in choice. We do not offer these data as more than an illustration of the
method.

We now show that if the direction of a preference is altered and the effect
is to increase the number of circular triads by $d$, $T$ is reduced by $2d$; and
conversely. Consider the preference $A \to B$. The only triads affected by altering
this to $B \to A$ are those containing the line AB. Suppose there are $\alpha$ preferences
of type $A \to X$ (including $A \to B$) and $\beta$ preferences of type $B \to X$. Then four
possible types of triad arise:

$A \to X \leftarrow B$, say $p$ in number,
$A \leftarrow X \to B$,
$A \to X \to B$, which must number $\alpha - p - 1$,
$A \leftarrow X \leftarrow B$, which must number $\beta - p$.

When the preference $A \to B$ is reversed the first two remain non-circular.
The third becomes circular, the fourth ceases to be so. The reduction in the
value of $T$ is

$$\alpha^2 + \beta^2 - (\alpha - 1)^2 - (\beta + 1)^2 = 2(\alpha - \beta - 1) = 2d, \text{ say.}$$

The increase in the number of circular triads is

$$(\alpha - p - 1) - (\beta - p) = \alpha - \beta - 1 = d.$$

More generally, if as the result of reversing any number of preferences $T$ is
decreased by $2d$, then $d$ must be an integer and the number of circular triads
must be increased by $d$. This clearly follows from the previous results for the
reversal of preferences can take place one at a time and the effect on $T$ and the
number of circular triads is cumulative.

We now investigate the maximum and minimum values of $T$. It is clear
from the definition that $T$ is greatest when the $a$'s are the natural numbers
$0, 1, \ldots, n - 1$; and this is a possible case because it corresponds to ordinary ranking.
Hence $\max(T) = (n^3 - n)/12$.
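Since every unit gained in circular triads costs two units of $T$, and the ranking case has the largest $T$ and no circular triads, the number of circular triads is half the shortfall of $T$ below its maximum. The sketch below checks this on the six-object table of preferences and on a single reversal. The matrix is a transcription of Table I as read here (illegible cells filled from the complementary entries, an assumption), and all function names are hypothetical.

```python
from itertools import combinations

# PREFERS[x][y] == 1 means object x is preferred to object y (objects A..F).
PREFERS = [
    [0, 1, 1, 0, 1, 1],  # A
    [0, 0, 0, 1, 1, 0],  # B
    [0, 1, 0, 1, 1, 1],  # C
    [1, 0, 0, 0, 0, 0],  # D
    [0, 0, 0, 1, 0, 1],  # E
    [0, 1, 0, 1, 0, 0],  # F
]


def scores(p):
    """a_r: number of arrows leaving each vertex."""
    return [sum(row) for row in p]


def t_statistic(a):
    """T: sum of squared deviations of the a's from their mean (n - 1) / 2."""
    n = len(a)
    return sum((x - (n - 1) / 2) ** 2 for x in a)


def circular_triads_from_t(a):
    """Half the shortfall of T below its maximum (n^3 - n) / 12."""
    n = len(a)
    return ((n**3 - n) / 12 - t_statistic(a)) / 2


def count_circular_triads(p):
    """Direct count, for comparison with the value derived from T."""
    n = len(p)
    return sum(1 for i, j, k in combinations(range(n), 3)
               if (p[i][j] + p[i][k], p[j][i] + p[j][k], p[k][i] + p[k][j]) == (1, 1, 1))


def reverse(p, x, y):
    """Copy of p with the preference between x and y reversed."""
    q = [row[:] for row in p]
    q[x][y], q[y][x] = q[y][x], q[x][y]
    return q


a = scores(PREFERS)
print(a, t_statistic(a), circular_triads_from_t(a), count_circular_triads(PREFERS))
# [4, 2, 4, 1, 2, 2] 7.5 5.0 5
```

Reversing $A \to B$ in this table, where $\alpha = 4$ and $\beta = 2$, lowers $T$ from 7.5 to 5.5 and raises the number of circular triads from five to six, so that $d = \alpha - \beta - 1 = 1$.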

For the minimum value, consider the polygon A^, A^, A,^. Set up the 

preferences A^-yA 2 -^...A,^-yA^, Clearly at any vertex this re.sults in one arrow 



M. G. Kendall and B. Babington Smith 


329 


entering and one leaving the vertex, i.e. the contribution to a is unity at each vertex. Next set up the preferences A_1→A_3→A_5→.... This circuit may either visit each vertex once, or not. In the latter case we proceed to an unvisited vertex and set up the preferences A_2→A_4→..., and so on. Again there will be a unit contribution to all the a's.

We then set up the preferences A_1→A_4→A_7→... etc. and so on; and in this way we shall ultimately complete the preference scheme.

If n is odd all the preferences described will consist of circular tours of the polygon, and thus the value of a for each vertex will be (n − 1)/2. If n is even the last preference A_i→A_{i+n/2} will not be a tour but will consist of the single line joining one vertex with the symmetrically opposite vertex. Thus there will be n/2 vertices for which a = n/2 and n/2 vertices for which a = (n − 2)/2. In this case T = n/4.

Now it is clear from the definition of T that it cannot be less than zero, or, if n is even, be less than n/4, since each a is then an integer differing from (n − 1)/2 by at least one-half. The configuration just given shows that these minima are, in fact, attainable.

Thus T can vary from a maximum of (n³ − n)/12 to a minimum of zero or n/4. Hence the maximum number of circular triads, being half the variation from maximum to minimum of T (the maximum of T corresponding to the ranking case in which there are no inconsistences), is (n³ − 4n)/24 if n is even and (n³ − n)/24 if n is odd.

This establishes the first two results enunciated at the beginning of this section. To prove the third it is sufficient to give a systematic method of proceeding from the configuration of minimum to that of maximum inconsistence by steps decreasing T two at a time. Consider, for example, the case n = 8. For the minimum inconsistence the a's will have the values 0 to 7, which we set out thus:

A B C D E F G H
0 1 2 3 4 5 6 7

We proceed by reversing the preferences between vertices whose a-values differ by two. This clearly reduces T by two.

Reversing the preferences between C and E we get

A B C D E F G H
0 1 3 3 3 5 6 7

and between D and F we get

A B C D E F G H
0 1 3 4 3 4 6 7

which we may rearrange as

A B C E D F G H
0 1 3 3 4 4 6 7

Now reversing the preferences between B and E and between D and G and rearranging we have

A B E C F D G H
0 2 2 3 4 5 5 7

and now interchanging A and B, G and H,

A B E C F D G H
1 1 2 3 4 5 6 6

At this stage we have preserved the a-numbers 2, 3, 4 and 5 in the middle but reduced the extremes A and H. We can now carry out the process again, arriving at the a-numbers

1 2 2 3 4 5 5 6

and twice again, giving

2 2 3 3 4 4 5 5

whence a final interchange gives

3 3 3 3 4 4 4 4

and this is the position of maximum inconsistence. It is readily verified by following the interchanges on a polygon diagram that the reversals are, in fact, legitimate.


Coefficient of consistence in paired comparisons 


9. If d is the number of circular triads in an observed configuration of preferences we define

$$\zeta = 1 - \frac{24d}{n^3 - n} \quad (n \text{ odd}), \qquad \zeta = 1 - \frac{24d}{n^3 - 4n} \quad (n \text{ even}), \tag{2}$$

and call ζ the coefficient of consistence. If and only if it is unity there are no inconsistences in the configuration, which may therefore be represented by a ranking. As ζ decreases to zero the inconsistence, as measured by the number of circular triads, increases.

For example, in the configuration of Fig. 1 there are five circular triads, ABD, ACD, AFD, AED and BEF. The maximum possible number is 8. Thus ζ = 0.375.
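As an illustration of definition (2), the following short sketch computes the maximum number of circular triads and the coefficient of consistence. The function names are ours, chosen for illustration, and the code assumes n ≥ 3 so that the maximum is not zero.

```python
def max_circular_triads(n):
    """Largest possible number of circular triads among n objects."""
    if n % 2:                       # n odd
        return (n ** 3 - n) // 24
    return (n ** 3 - 4 * n) // 24   # n even


def coefficient_of_consistence(d, n):
    """Coefficient of consistence: 1 minus d over the maximum number of triads."""
    return 1 - d / max_circular_triads(n)
```

With six objects and five circular triads this gives 1 − 5/8, the value quoted above for Fig. 1.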

10. ζ can also be interpreted in the light of Table I. Suppose, in that table, we sum the rows. (The column sums are determined by the row sums and add no fresh information.) The sum of any row will be the a-number for that vertex in the polygon which corresponds to the object defining the row. T will then be the value of the sum of squares of deviations of row totals from the mean value (n − 1)/2, that is to say, will be the variance of the row sums multiplied by n. ζ is thus a linear function of this variance; but it cannot be tested in the χ² distribution as if Table I were a contingency table, for the border cells are not independent or linearly dependent.


11. If an individual observer produces a configuration of preferences which show inconsistence there are usually several explanations; he may be an incompetent judge, the objects may be so alike that consistent differentiation is not possible, or his attention may wander during the course of the experiment. We discuss these questions later. They are mentioned here to explain the motive for the next stage of the mathematics. With what probability can a value of ζ arise by chance if the observer allots his preferences at random with respect to the quality under consideration?

With n objects there are $2^{\binom{n}{2}}$ possible configurations of preferences. We proceed to investigate the distribution of d in this universe of $2^{\binom{n}{2}}$ different members. The method consists of proceeding from the distribution for n to that for (n + 1).

For n = 3 there are eight configurations, of which two give one circular triad and six no circular triads. Consider the effect of adding a new vertex D to the vertices ABC. Four cases arise:

(1) D→ all of A, B, C.

(2) D→ two of A, B, C.

(3) D→ one of A, B, C.

(4) D→ none of A, B, C.

The last two are symmetrical with the first two and need not be separately 
considered. 

Situation (1) arises in one way and clearly does not add any new circular triads other than those already existing in the configuration ABC. It therefore contributes six values d = 0 and two values d = 1. So does situation (4).

Situation (2) arises in three ways, according as D←A, B, or C. The configurations so reached are similar and we may take any one, say D←C as the single preference. If A←C then DAC is not circular and if B←C then DBC is not circular. On the other hand A→C and B→C will each produce a circular triad. We then have the cases

Configuration     No. of circular triplets added
A←C→B             0
A→C→B             1
A←C←B             1
A→C←B             2
We now consider AB. In the cases A←C→B and A→C←B the direction of AB does not matter and no circular triads are added. With A→C→B, A→B gives no circular triad but A←B adds one. With A←C←B, A→B adds one and A←B adds none.

Thus the number of circular triads occurring for these four cases is found to be

No. of circular triplets     Frequency
0                            2
1                            2
2                            4

We must multiply the frequency by three and by two to allow for similar symmetrical arrangements, and, adding the contributions of situations (1) and (4), the final results are

No. of circular triplets     Frequency
0                            24
1                            16
2                            24
Total                        64


The principles of this method are clear enough and the work may be formalized by a number of conventions which we omit to save space. In common with many similar combinatorial problems, however, troubles arise from the sheer number of possibilities and the difficulty of ensuring that nothing is overlooked. Up to the present we have found the distribution of d for n up to and including 7. The frequencies and probabilities are given in Table II.
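For small n the distribution of d can also be obtained by exhaustive enumeration. The sketch below is a brute-force check and not the recursive method described above; the function name and the representation of a configuration (one Boolean per pair of objects) are assumptions made for illustration. It is practical only for small n, since the number of configurations doubles with every additional pair.

```python
from collections import Counter
from itertools import combinations, product


def triad_distribution(n):
    """Frequency of each number of circular triads over all configurations of n objects."""
    pairs = list(combinations(range(n), 2))
    triples = list(combinations(range(n), 3))
    dist = Counter()
    for bits in product((False, True), repeat=len(pairs)):
        # forward[(i, j)] is True when i is preferred to j (with i < j).
        forward = dict(zip(pairs, bits))
        d = 0
        for i, j, k in triples:
            ij, jk, ik = forward[(i, j)], forward[(j, k)], forward[(i, k)]
            # Circular when i -> j -> k -> i, or when all three arrows are reversed.
            if (ij and jk and not ik) or (not ij and not jk and ik):
                d += 1
        dist[d] += 1
    return dict(sorted(dist.items()))
```

For n = 4 the enumeration reproduces the frequencies 24, 16 and 24 obtained above.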

12. For the values already obtained the moments are given by the following formulae:

$$\mu_1' \text{ (about 0)} = \frac{1}{4}\binom{n}{3}, \tag{3}$$

$$\mu_2 = \frac{3}{16}\binom{n}{3}, \tag{4}$$

$$\mu_3 = -\frac{3}{32}\binom{n}{3}(n-4), \tag{5}$$

$$\mu_4 = \frac{3}{256}\binom{n}{3}\left\{9\binom{n-3}{3} + 39\binom{n-3}{2} + 9\binom{n-3}{1} + 7\right\}. \tag{6}$$

We have very little doubt that these results are true in general but can offer no rigorous proof. In so far, however, as the moments are in a sense symmetric sums it appears highly probable that they are given by polynomials in n; and if this is so the values obtained are sufficient to establish polynomials of degree six or less.

It is also to be noted that from the above values of the moments, for large n,

$$\beta_1 = \mu_3^2/\mu_2^3 \sim 8/n, \qquad \beta_2 = \mu_4/\mu_2^2 \sim 3 + 12/n,$$

from which it appears that a Type III distribution would fit the d-distribution fairly closely for moderate or large values of n. But as the distribution of d is of interest mainly for low values of n, which are all that occur in practice, it hardly seems worth while attempting to fit a curve.


Agreement among several observers 
13. We now consider the investigation of similarities of judgments for m observers. Suppose that in a table of the form of Table I we enter a unit in the cell in row X and column Y whenever X→Y and count the units in each cell. A cell may then contain any number from 0 to m. If the observers are in complete agreement there will be $\binom{n}{2}$ cells containing the number m, the remaining cells being zero. The agreement may be complete even if there are inconsistences present.

Biometrika xxxi 2* 
Suppose that the cell in row X and column Y contains the number γ. Let

$$\Sigma = \sum \binom{\gamma}{2}, \tag{7}$$

the summation extending over the n(n − 1) cells of the table (the diagonal cells being ignored). Σ is then the sum of the number of agreements between pairs of judges. Put

$$u = \frac{2\Sigma}{\binom{m}{2}\binom{n}{2}} - 1. \tag{8}$$

The maximum number of agreements, occurring if $\binom{n}{2}$ cells each contain m, is $\binom{n}{2}\binom{m}{2}$, and thus in the case of complete agreement, and only in this case, u = 1. The further we go from this case, as measured by agreements between pairs of observers, the smaller u becomes. The minimum number of agreements occurs when each cell contains m/2 if m is even or (m ± 1)/2 if m is odd. That is, if m is even, the minimum number of agreements is

$$2\binom{n}{2}\binom{m/2}{2} = \binom{n}{2}\frac{m(m-2)}{4}$$

and in this case

$$u = -\frac{1}{m-1}.$$

When m is odd the minimum value of u is found to be

$$u = -\frac{1}{m}.$$

14. We propose to call u the coefficient of agreement. It is unity if and only if there is complete agreement in the comparisons. Its minimum value is not −1 except when m = 2. This, however, is to be expected in a measure of agreement, for there can be no such thing as complete disagreement among three or more observers in paired comparisons. If observer P differs in certain comparisons from observers Q and R, the two latter must agree on those comparisons.

When m = 2, u reduces to

$$u = \frac{2\Sigma}{\binom{n}{2}} - 1$$

and 2Σ becomes twice the number of cases in which the two observers agree about a comparison. u is thus a generalization of a coefficient τ proposed by Kendall (1938) to measure the correlation between two rankings. For general m, if the entries in the table were constrained to the ranking type, u would be the average intercorrelation τ between observers taken two at a time.
15. In discussing the significance of u it is desirable to know whether the set of preferences which give rise to it could have arisen by chance if the preferences had been assigned at random with respect to the quality under consideration. The procedure which first suggests itself is a generalization of the method used for the case of rankings (Kendall & Babington Smith, 1939). That is to say, we sum the entries in the rows of the table and consider the variance of these entries. If the preferences are allotted at random we expect to find about equal numbers given to each object, and the variance will be low; in other cases it will be higher.

The difficulty about this suggestion is that it has not been found possible to ascertain the distribution of the variance in the $2^{m\binom{n}{2}}$ possible sets of preferences. The case m = 1, corresponding to the distribution of d for inconsistences, is difficult enough to solve. For higher values of m we have failed to find any distributions except in trivial cases.

We can, however, offer a test based on the distribution of u (or Σ). The comparative simplicity of the distributions in this case is in accordance with the remark made by Kendall in the paper under reference that the distribution of τ is much simpler and much more regular than the distribution of the Spearman correlation coefficient ρ.

16. Consider one cell in the table in row X and column Y and let it contain the number γ. Then the corresponding cell in row Y and column X will contain m − γ. Thus these two contribute to Σ the amount $\binom{\gamma}{2} + \binom{m-\gamma}{2}$. Now, of the total $2^m$ ways in which the units can be distributed in the first cell there are $\binom{m}{\gamma}$ in which γ units occur. Consequently the distribution of Σ in the cell and the corresponding cell is given by the expression

$$f = \sum_{\gamma=0}^{m}\binom{m}{\gamma}\,t^{\binom{\gamma}{2}+\binom{m-\gamma}{2}}, \tag{12}$$

in which the coefficient of each power of t is the frequency of the corresponding value of Σ; and since the distribution in other pairs of cells is independent if the preferences are allotted at random the distribution of Σ for the whole table is given by

$$f(\Sigma) = f^{N}, \tag{13}$$

where $N = \binom{n}{2}$.

17. The distributions have been worked out for the following values of m and n: m = 3, n = 2 to 8; m = 4, n = 2 to 6; m = 5, n = 2 to 5; m = 6, n = 2 to 4.

Tables III to VI give the probabilities based on these distributions, i.e. the probabilities that a given value of Σ will be attained or exceeded.
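The exact probabilities can be recomputed by multiplying out the frequencies for the separate pairs of cells. The following is a minimal sketch under that reading of (12) and (13); the function names and the dictionary representation of a frequency distribution are assumptions made for illustration, not the procedure used to prepare the tables.

```python
from fractions import Fraction
from math import comb


def pair_frequencies(m):
    """Frequencies of C(g, 2) + C(m - g, 2) for one pair of related cells."""
    freq = {}
    for g in range(m + 1):
        value = comb(g, 2) + comb(m - g, 2)
        freq[value] = freq.get(value, 0) + comb(m, g)
    return freq


def sigma_distribution(m, n):
    """Frequency distribution of the sum over the N = C(n, 2) independent pairs of cells."""
    single = pair_frequencies(m)
    dist = {0: 1}
    for _ in range(comb(n, 2)):
        nxt = {}
        for s, f in dist.items():
            for v, g in single.items():
                nxt[s + v] = nxt.get(s + v, 0) + f * g
        dist = nxt
    return dist


def tail_probability(m, n, sigma):
    """Exact probability that the sum attains or exceeds the given value."""
    dist = sigma_distribution(m, n)
    total = 2 ** (m * comb(n, 2))
    return Fraction(sum(f for s, f in dist.items() if s >= sigma), total)
```

For instance, `tail_probability(3, 3, 9)` is 1/64, the probability that three observers agree completely on all three comparisons among three objects.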

For constant n the distribution tends to the Type III form as m tends to
infinity. In fact, for a single pair of related cells the variate value corresponding




TABLE III

The probability P that a value of Σ will be attained or exceeded, for m = 3, n = 2 to 8

   n = 2         n = 3         n = 4         n = 5            n = 8
  Σ    P        Σ    P        Σ    P        Σ    P           Σ    P
  1  1.000      3  1.000      6  1.000     10  1.000        28  1.000
  3   .250      5   .578      8   .822     12   .944        30  1.000
                7   .156     10   .466     14   .756        ...
                9   .016     12   .169     16   .474        42   .572
                             14   .038     18   .224        44   .400
                             16   .0046    20   .078        ...
                             18   .00024   22   .020        50   .068
                                           24   .0035       52   .029
                                           26   .00042      ...
                                           28   .000030
                                           30   .00000095













TABLE V

The probability P that a value of Σ will be attained or exceeded,
for m = 5 and n = 2 to 5



TABLE VI


The probability P that a value of Σ will be attained or exceeded,
for m = 6 and n = 2 to 4
to a frequency $\binom{m}{\gamma}$ is $\binom{\gamma}{2} + \binom{m-\gamma}{2}$, a quadratic in the cell entry γ. Were the variate value a linear function of γ the distribution for the single cell would tend to normality in accordance with the well-known property of the binomial. The case of the quadratic value corresponds to a transformation of the variate of the type x² = y and the transform of the normal form exp(−x²) dx becomes the Type III form exp(−y) y^{−1/2} dy. Since the N cells are independent and the sum of variates in the same Type III form is also distributed in that form, it follows that Σ is in the limit distributed as exp(−Σ) Σ^{N/2 − 1} dΣ except perhaps for some constants. Thus Σ or some multiple of it is distributed as χ².

For constant m the distribution tends to normality with increasing n.

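18. The first of these results suggests that the Type III distribution will provide an approximation to the distribution (13) when m is moderately large. We proceed to find the first four moments of (13).

It is sufficient to find the first four moments of (12), those of (13) being obtainable therefrom in virtue of the relationships which connect semi-invariants of independent distributions.

The rth moment of (12) about the origin is given by

$$\mu_r' = \frac{1}{2^m}\sum_{\gamma=0}^{m}\binom{m}{\gamma}\left\{\binom{\gamma}{2} + \binom{m-\gamma}{2}\right\}^r, \tag{14}$$

since $2^m$ is the total frequency. Thus we have

$$2^m\mu_1' = \sum_{\gamma=0}^{m}\binom{m}{\gamma}\left\{\gamma^2 - m\gamma + \binom{m}{2}\right\}. \tag{15}$$

$\sum \gamma^p\binom{m}{\gamma}$ can be obtained by operating on the binomial $(1+x)^m$ p times by $x\,\frac{d}{dx}$ and putting x = 1; e.g. we find

$$\sum\gamma\binom{m}{\gamma} = m\,2^{m-1}, \qquad \sum\gamma^2\binom{m}{\gamma} = m(m+1)\,2^{m-2},$$

and hence, substituting in (15),

$$\mu_1' = \frac{1}{2}\binom{m}{2}.$$

Thus the mean of the distribution (13) is given by

$$\mu_1'(\Sigma) = \frac{1}{2}\binom{n}{2}\binom{m}{2}. \tag{16}$$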
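In a similar way we find

$$\mu_2 = \frac{1}{8}Nm(m-1), \tag{17}$$

$$\mu_3 = \frac{1}{8}Nm(m-1)(m-2), \tag{18}$$

$$\mu_4 = \frac{3m^2 - 15m + 17}{8}\,N\binom{m}{2} + \frac{3}{64}N^2(m^2 - m)^2. \tag{19}$$

These are the moments of Σ. Those of u are obtained by dividing by an appropriate power of $\frac{1}{2}\binom{n}{2}\binom{m}{2}$, and it may be noted in particular that the mean of u is zero.

We have directly from (17), (18) and (19)

$$\beta_1 = \frac{8(m-2)^2}{Nm(m-1)}, \qquad \beta_2 = 3 + \frac{4(3m^2 - 15m + 17)}{Nm(m-1)}.$$

For constant m, as N→∞, β₁ → 0 and β₂ → 3; and for constant N, as m→∞, β₁ → 8/N and β₂ → 3 + 12/N, confirming the tendency towards the Type III distribution.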
19. The first four moments of the Type III distribution

$$dF = k\,e^{-px}x^{q-1}\,dx$$

are

$$\mu_1' = \frac{q}{p}, \qquad \mu_2 = \frac{q}{p^2}, \qquad \mu_3 = \frac{2q}{p^3}, \qquad \mu_4 = \frac{3q(q+2)}{p^4}.$$

Equating the second and third moments to those given by (17) and (18) we find

$$q = \frac{Nm(m-1)}{2(m-2)^2}, \tag{20}$$

$$p = \frac{2}{m-2}. \tag{21}$$

To make the first moments correspond we move the origin of the Σ distribution $\frac{1}{2}\binom{n}{2}\binom{m}{2}\frac{m-3}{m-2}$ to the right. We thus reach the approximation to the Σ distribution, coinciding in the first three moments,

$$dF = k\,e^{-\frac{2x}{m-2}}\,x^{\frac{Nm(m-1)}{2(m-2)^2}-1}\,dx,$$

where

$$x = \Sigma - \frac{1}{2}\binom{n}{2}\binom{m}{2}\frac{m-3}{m-2};$$

or, transforming to the more usual form by putting χ² = 4x/(m − 2), we find that

$$\chi^2 = \frac{4}{m-2}\left\{\Sigma - \frac{1}{2}\binom{n}{2}\binom{m}{2}\frac{m-3}{m-2}\right\} \tag{22}$$

is distributed as χ² with

$$\nu = \binom{n}{2}\frac{m(m-1)}{(m-2)^2} \tag{23}$$

degrees of freedom.

The fourth moments of Σ and the χ² approximation differ by terms of order N⁻¹ and m⁻¹ compared with their absolute values.

20. It only remains to be seen how large m and n must be for this to provide a satisfactory approximation.

Consider first the distributions for m = 3. When n = 8, N = 28, we have, for the approximation, 4Σ distributed with 168 degrees of freedom. From Table III we see that for Σ = 54, P = 0.011 and for Σ = 58, P = 0.0011. Applying a continuity correction by deducting unity from Σ we find for the χ² approximation with χ² = 4 × 53, ν = 168, P = 0.011, and with χ² = 4 × 57, P = 0.00114. The correspondence is very close, in spite of the low value of m.

For m = 4, n = 5, N = 10, the approximation gives 2Σ − 30 distributed with 30 degrees of freedom. For Σ = 40 and 41, this gives, with continuity corrections of 0.5, half the variate-interval, χ² = 49 and 51, ν = 30. From the diagram given in Yule & Kendall's “Introduction to the Theory of Statistics” (1937) it is seen that these values lie one on either side of the 1 % value; and this is in accordance with the exact values of P, which are seen from Table IV to be 0.016 and 0.0088. Similarly we find that the values of Σ, 37 and 38, lie on either side of the 5 % level, which is again in accordance with the exact values, P = 0.060 and 0.038.

For m = 6, n = 4, N = 6, the approximation gives Σ − 33.75 distributed with 11.25 degrees of freedom. For Σ = 59 and 60 the corresponding χ² values are seen to lie on either side of the 1 % point, which accords with the exact value of Table VI.

We conclude that the χ² approximation provides an adequate test of significance for the values of m and n outside the range for which Tables III-VI give exact values.
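The approximation of equations (22) and (23) is easily put into computational form. The sketch below is illustrative only: the function name is ours, it assumes m > 2 (the formulae are undefined for m = 2), and any continuity correction is to be applied to Σ by the caller before the function is used.

```python
from math import comb


def chi2_approximation(sigma, m, n):
    """Return (chi-squared, degrees of freedom) from equations (22) and (23)."""
    N = comb(n, 2)
    shift = 0.5 * N * comb(m, 2) * (m - 3) / (m - 2)
    chi2 = 4.0 / (m - 2) * (sigma - shift)
    nu = N * m * (m - 1) / (m - 2) ** 2
    return chi2, nu
```

With m = 4, n = 5 and the corrected value Σ = 39.5 this returns χ² = 49 on 30 degrees of freedom, as in the second example above.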
21. As a matter of theoretical interest we may record the results for the
distribution of u when the data are ranked. It appears that in this case 

2n + 6 ^ {2n + 6f ]' 

m-‘227i^ + Qn + l\j' 2\2/\2jh S{m-2)2n^ + Qn+lj_ 

is distributed approximately as χ² with

(2) (^<(2) 

This result is not of much practical value. The case of m rankings can be 
more simply treated by other methods. 

INTERPRETATION OF RESULTS OF PAIRED COMPARISONS

22. In the light of the foregoing theory we may discuss the interpretation 
of the results of a paired comparison experiment. 

If for each observer the coefficient of consistence is unity the comparisons 
reduce to rankings and may be discussed by known methods. But if some 
or all the coefficients are not unity we have to consider the following 
possibilities : 

(a) Some of the observers may be bad judges and the inconsistences reflect their shortcomings in making comparisons.

(b) Some of the objects may differ by amounts which fall below the threshold of distinguishability for some observers.

(c) The property under judgment may not be a linear variate at all and we may be getting the sort of confusion which would result if observers were asked to compare English towns according to the bivariate concept “geographical position”.

(d) Several of the effects may be operating simultaneously.

23. If we have only one observer and have no prior knowledge of his 
capabilities it is not in general possible to apportion his inconsistences among 
these causes. Exceptions may occur when the inconsistences are of a marked 
and peculiar kind; for instance if they involve only four objects out of 15, we 
may suspect that the four are practically indistinguishable rather than that the 
observer is unable to make distinctions at all and avoided inconsistences among 
the others by sheer chance. But even here conclusions drawn a posteriori after 
inspection of the data are dubious. Table II gives a test of the hypothesis that an observer is incapable of making judgments. For example, with n = 7, the chances are 983 in 1000 that if the preferences are made at random there will be more than two inconsistent triads, so that if we find two or less, it is improbable that the observer is completely incapable of judgment. We might then be led to suppose that his small deviation from internal consistence is due to fluctuation of attention, very close resemblance of the objects giving rise to the inconsistences, or both.
24. With m observers the investigation can be taken a good deal further. If all the observers show inconsistences we suspect that the objects are at fault or that the observers are being asked to perform an impossible task. On the other hand, if most of the observers show a small or zero inconsistence we suspect that the others are just bad judges and may reject their data accordingly.

As between indistinguishability of objects and non-linearity of variate, a choice of explanations would depend largely on the extent to which inconsistences were concerned with the same set of objects. If there is a high value of u, indicating concordance of judgment, we expect to find most of the inconsistences confined to certain objects, and common to observers. In this case we suspect that the objects are close together in the degree to which they exhibit the quality under consideration. But if the observers scatter their inconsistences over the whole field u will be moderate or low and we suspect that the observers are being asked to do something beyond their capacity; and this brings us to question the validity of regarding the quality as a linear variable.

25. When a quality such as “bravery” or “intelligence” is insusceptible to measurement there is frequently doubt of this kind. But this has not deterred investigators from assuming that such statistical variables exist, or from requiring observers to rank objects according to them, or in some cases from replacing such rankings by quantiles of the normal curve. We are never tired of criticizing this Principle of the Hypostasis of Plausible Terminology. Hitherto it has flourished largely because of the difficulty of adducing evidence against it; and we hope that the inconsistence of paired comparisons will provide a criterion, however rough, of the legitimacy of the methods to which it leads.

But we would emphasize that our approach to the method of paired com- 
parisons has a somewhat different object from that elaborated by Thurstone 
(1927 and many subsequent papers). As we understand it, his method is appro- 
priate where one is entitled to assume a priori or by reason of precautions taken 
in the selection of material that a linear variable is involved and that there exist 
perceptible differences between the items presented for comparison. Our object 
is to make it possible to dispense with such assumptions and precautions. 

26. A few words may be added about the case in which an objective order is known to exist (as, for instance, in judging individuals according to age or weight). In such circumstances the appearance of inconsistences will indicate unreliability on the part of the observer or subliminal differences between objects. A measure of the observer's reliability may be obtained by calculating u between known and observed comparisons. If ζ is high enough to enable us to accept his judgments as internally consistent on the whole, u may still be low enough to reject his judgments as accurate.

27. We conclude with an example of the application of the foregoing theory to some experimental material.

Classes of children (ages 11 to 13 inclusive) were asked to state their preferences with respect to certain school subjects. Each child was given a sheet on which were written the possible pairs of subjects and asked to underline the one preferred in each case. Two classes gave the following results:

(a) 21 boys, 13 school subjects. The preferences are shown in Table VII, which is in the form described in section 13; e.g. there were 18 boys who preferred Art to Religion.

TABLE VII

Preferences of 21 boys in 13 subjects

                            1   2   3   4   5   6   7   8   9  10  11  12  13  Total
 1. Woodwork                —  14  20  15  15  16  16  18  18  18  20  21  20    211
 2. Gymnastics              7   —  14  12  13  18  14  16  16  20  16  18  19    183
 3. Art                     1   7   —  10  14  10  16  18  16  16  17  16  19    160
 4. Science                 6   9  11   —  11  12  15  14  13  13  17  17  16    154
 5. History                 6   8   7  10   —  14  11  12  14  15  13  14  16    140
 6. Geography               5   3  11   9   7   —  14  14  13  13  16  15  17    137
 7. Arithmetic              5   7   5   6  10   7   —   9  11  13  15  13  15    116
 8. Religion                3   5   3   7   9   7  12   —  12  14  14  16  14    116
 9. English Literature      3   5   5   8   7   8  10   9   —  10  13  13  15    106
10. Commercial subjects     3   1   5   8   6   8   8   7  11   —  10  10  14     91
11. Algebra                 1   5   4   4   8   5   6   7   8  11   —  10  13     82
12. English Grammar         0   3   5   4   7   6   8   5   8  11  11   —  13     81
13. Geometry                1   2   2   5   5   4   6   7   6   7   8   8   —     61
    Total                                                                       1638

The calculation of Σ for this table, in which the objects are arranged in order of total number of preferences, may be shortened by noting that Σ as given by equation (7) may be transformed into the form

$$\Sigma = \sum\gamma^2 - m\sum\gamma + \binom{m}{2}\binom{n}{2},$$

where the summation now takes place over the half of the table below the diagonal. Since the numbers in this half are smaller than those in the other half there is a considerable saving in arithmetic.

We find Σ = 9718 and hence

$$u = \frac{2 \times 9718}{\binom{21}{2}\binom{13}{2}} - 1 = 0.186.$$


There is thus a certain amount of agreement among the children, indicated 
by the positive value of u. Is this significant? 
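The arithmetic for the boys' table can be retraced as follows. This is a sketch only: the matrix is our transcription of Table VII, with the diagonal set to zero and on the assumption that every pair of opposite cells sums to 21, and the function names are ours. It evaluates Σ both from equation (7) over the whole table and from the shortened form over the lower half.

```python
from math import comb

M_BOYS = 21
# TABLE_VII[i][j]: number of boys preferring subject i + 1 to subject j + 1.
TABLE_VII = [
    [0, 14, 20, 15, 15, 16, 16, 18, 18, 18, 20, 21, 20],
    [7, 0, 14, 12, 13, 18, 14, 16, 16, 20, 16, 18, 19],
    [1, 7, 0, 10, 14, 10, 16, 18, 16, 16, 17, 16, 19],
    [6, 9, 11, 0, 11, 12, 15, 14, 13, 13, 17, 17, 16],
    [6, 8, 7, 10, 0, 14, 11, 12, 14, 15, 13, 14, 16],
    [5, 3, 11, 9, 7, 0, 14, 14, 13, 13, 16, 15, 17],
    [5, 7, 5, 6, 10, 7, 0, 9, 11, 13, 15, 13, 15],
    [3, 5, 3, 7, 9, 7, 12, 0, 12, 14, 14, 16, 14],
    [3, 5, 5, 8, 7, 8, 10, 9, 0, 10, 13, 13, 15],
    [3, 1, 5, 8, 6, 8, 8, 7, 11, 0, 10, 10, 14],
    [1, 5, 4, 4, 8, 5, 6, 7, 8, 11, 0, 10, 13],
    [0, 3, 5, 4, 7, 6, 8, 5, 8, 11, 11, 0, 13],
    [1, 2, 2, 5, 5, 4, 6, 7, 6, 7, 8, 8, 0],
]


def sigma_full(table):
    """Sum of C(gamma, 2) over every off-diagonal cell, as in equation (7)."""
    n = len(table)
    return sum(comb(table[i][j], 2) for i in range(n) for j in range(n) if i != j)


def sigma_lower_half(table, m):
    """Shortened form: sum of gamma^2 - m*gamma below the diagonal, plus C(m,2)C(n,2)."""
    n = len(table)
    lower = [table[i][j] for i in range(n) for j in range(i)]
    return sum(g * g for g in lower) - m * sum(lower) + comb(m, 2) * comb(n, 2)


def coefficient_of_agreement(table, m):
    """Coefficient of agreement u of equation (8)."""
    n = len(table)
    return 2 * sigma_full(table) / (comb(m, 2) * comb(n, 2)) - 1
```

Both routes give Σ = 9718 for this transcription, and the coefficient of agreement comes to a little under 0.187.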

We note first of all that this distribution of preferences could not have arisen by chance to any acceptable degree of probability. In fact, χ² = 412.4 (equation (22)) and ν = 90.7. The large value of ν justifies the use of the normal approximation to the χ² distribution and we find √(2χ²) − √(2ν − 1) = 15.3, a very improbable result on the hypothesis of a random allocation of preferences.

The distribution of circular triads was as follows:

No. of triads   Frequency     No. of triads   Frequency
      0             1               12             1
      1             1               17             3
      4             5               21             1
      6             2               25             1
      7             2               29             1
      8             1               39             1
     10             1             Total           21

The total number of circular triads was 242 with a mean of 11.5. Only one boy was entirely consistent. On the other hand, for n = 13 the maximum number of circular triads is 91, with a mathematical expectation of 71.5. It is thus clear that, except perhaps for one boy, we cannot suppose that any boy allotted preferences at random. We are again led to conclude that the boys are genuinely capable of making distinctions, and that consistently on the whole. Half the boys have coefficients of consistence ζ greater than 0.92.

We conclude that the boys can make preferences and that in their view the subjects are sufficiently different to enable a reasonably consistent set of preferences to be made. So far as these data are concerned we would see no objection to the assumption that a scale of preferences can be set up. With this in mind we can say that the value of u indicates a certain amount of agreement, though not a strong one, between the boys as to which subjects they prefer.

(b) 25 girls, 11 school subjects. Table VIII shows the data.

TABLE VIII

Preferences of 25 girls in 11 subjects

                            1   2   3   4   5   6   7   8   9  10  11  Total
 1. Gymnastics              —  10  19  17  20  17  21  21  21  18  22    186
 2. Science                15   —  12  15  17  15  21  19  18  16  17    165
 3. Art                     6  13   —  16  16  18  10  17  16  19  16    147
 4. Domestic Science        8  10   9   —  16  11  13  15  14  11  14    121
 5. History                 5   8   9   9   —  14  18  12  13  15  18    121
 6. Arithmetic              8  10   7  14  11   —  12  13  12  16  18    121
 7. Geography               4   4  15  12   7  13   —  14  15  14  14    112
 8. English Literature      4   6   8  10  13  12  11   —  14  13  14    105
 9. Religion                4   7   9  11  12  13  10  11   —  11  17    105
10. Algebra                 7   9   6  14  10   9  11  12  14   —  12    104
11. English Grammar         3   8   9  11   7   7  11  11   8  13   —     88
    Total                                                               1375
We find Σ = 8928, u = 0.082.

For the significance test, χ² = 180.3, ν = 62.4 and √(2χ²) − √(2ν − 1) = 7.9, as before a very significant result.

The distribution of circular triads was

No. of triads   Frequency     No. of triads   Frequency
      1             2               17             1
      2             2               19             1
      3             1               22             1
      4             1               23             2
      6             1               27             1
      8             1               32             1
      9             1               35             1
     11             2               37             1
     12             2               38             1
     13             1                —
     14             1             Total           25

The total number of circular triads is 382 with a mean 15.28. For n = 11 the maximum number of circular triads is 55 with an expectation of 41.25. Several of the girls come very close to this, the worst having a coefficient of consistence equal to 0.31.

We are, however, again led to conclude that the preferences were not 
allotted at random and that most of the girls are capable of exercising a judgment 
which is on the whole consistent. There is only a very slight agreement in 
preferences. 

Thus the girls are less consistent and less alike in preferences than the boys.
THE MEAN AND VARIANCE OF χ², WHEN USED AS
A TEST OF HOMOGENEITY, WHEN EXPECTATIONS
ARE SMALL


By J. B. S. HALDANE, F.R.S.


Pearson's measure of divergence, χ², can be used not only as a test of goodness of fit, but as a test of homogeneity when the expectations are unknown before the samples are observed. Consider the (m × n)-fold table:

a_11   a_12   a_13   ...   a_1m   | s_1
a_21   a_22   a_23   ...   a_2m   | s_2
a_31   a_32   a_33   ...   a_3m   | s_3
...
a_n1   a_n2   a_n3   ...   a_nm   | s_n
t_1    t_2    t_3    ...   t_m    | N

Here each a_ij represents a number of individuals observed, and s_i = Σ_j a_ij, t_j = Σ_i a_ij, N = Σ_i s_i = Σ_j t_j. The table may represent n samples of s_1, s_2, ..., s_n individuals, each sample falling into m classes, t_1, t_2, ..., t_m being the grand totals in each class. Or it may represent m samples of t_1, t_2, ..., t_m individuals, each falling into n classes, the class totals being s_1, s_2, ..., s_n. In each case we ask what is the probability of so bad a fit if every sample is taken from the same large population.


The expected value of a_ij is clearly s_i t_j / N, and

$$\chi^2 = \sum_{i,j}\frac{(a_{ij} - s_it_j/N)^2}{s_it_j/N} = N\sum_{i,j}\frac{a_{ij}^2}{s_it_j} - N.$$

Fisher (1922) showed that when every s_i and t_j was sufficiently large, χ² has the usual distribution, with (m − 1)(n − 1) degrees of freedom. Thus its mean is (m − 1)(n − 1), its variance 2(m − 1)(n − 1). Haldane (1937, 1938, 1939) investigated the exact values of the moments of χ² when expectations are small, in (m × n)-fold tables with mn or m(n − 1) degrees of freedom, but did not attempt the present problem. This had previously been done by Cochran (1936), in the restricted case when m = 2, and every s_i is equal. Cochran used a method based on the use of characteristic functions. My own method is entirely elementary, and completely exact, though rather tedious. My results are very close to Cochran's, but the accurate values are worth putting on record.


The mean value of $\chi^2$

Given the marginal totals, the probability of any particular distribution is

$$P = \frac{\prod_i (s_i!)\,\prod_j (t_j!)}{N!\,\prod_{ij} (a_{ij}!)}.$$

And this is so whatever may be the true expectations. Thus, if in the $i$th sample
the expectation of the $j$th observation is $s_i p_j$, so that that of $t_j$ is $N p_j$, then the
probability of the given distribution is

$$\frac{\prod_i (s_i!)\,\prod_j p_j^{t_j}}{\prod_{ij} (a_{ij}!)}.$$

But the probability of the given marginal totals is

$$\frac{N!\,\prod_j p_j^{t_j}}{\prod_j (t_j!)}.$$

Hence $P$ is the probability of obtaining any particular distribution with the given
sample sizes $s_i$ and class totals $t_j$.

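If $E(x)$ denotes the sampling expectation of $x$, then $E(a_{ij}) = \Sigma P a_{ij}$, summation
being taken over all samples possible with the given marginal totals. Hence

$$E(a_{ij}) = \frac{s_i t_j}{N} \sum \frac{(s_i-1)!\,(t_j-1)!\,\prod_k (s_k!)\,\prod_l (t_l!)}{(N-1)!\,(a_{ij}-1)!\,\prod_{kl} (a_{kl}!)},$$

where $k$ assumes all positive integral values between 0 and $n$ except $i$, and $l$ all
positive integral values between 0 and $m$ except $j$, the product in the denominator
extending over every cell other than $a_{ij}$. That is to say the quantity
summed is the probability of a set of observations similar to those of the table
except that $a_{ij}$, $s_i$ and $t_j$ have been diminished by unity. Hence the sum equals unity, and

$$E(a_{ij}) = \frac{s_i t_j}{N}.$$

Similarly

$$E[a_{ij}(a_{ij}-1)] = \frac{s_i(s_i-1)\,t_j(t_j-1)}{N(N-1)}, \qquad
E[a_{ij}(a_{ij}-1)(a_{ij}-2)] = \frac{s_i(s_i-1)(s_i-2)\,t_j(t_j-1)(t_j-2)}{N(N-1)(N-2)},$$

and so on. Hence

$$\begin{aligned}
E(\chi^2) &= N \sum_{ij} \frac{E(a_{ij}^2)}{s_i t_j} - N \\
&= \sum_{ij} \left[\frac{(s_i-1)(t_j-1)}{N-1} + 1\right] - N \\
&= \frac{1}{N-1}\,(N^2 - nN - mN + mnN) - N \\
&= \frac{(m-1)(n-1)N}{N-1}, \tag{1}
\end{aligned}$$

a result previously obtained by Bartlett (1937).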

This clearly becomes $(m-1)(n-1)$ when $N$ tends to infinity. In the case
where $m = 2$, and every $s_i = s$, we have

$$E(\chi^2) = \frac{(n-1)ns}{ns-1}. \tag{2}$$

Cochran gave

$$E(\chi^2) = n - 1 + \frac{ns - s + 1}{ns^2}.$$

This exceeds (2) by $\dfrac{s-1}{ns^2(ns-1)}$, a discrepancy of order $n^{-2}$. It will be noted
that the mean depends only on the number of degrees of freedom and the grand
total $N$. The higher moments involve the marginal totals $s_i$ and $t_j$, and are
therefore more complicated. It is clear that the expectation is diminished because,
in sampling from a finite group of $N$ individuals, $E(a_{ij})$ has the same value as in
sampling from an infinite population with the same frequency, but $E(a_{ij}^2)$ is less.


The variance in a $(2 \times n)$-fold table with equal samples

Owing to the complexity of the general expression for the variance, we shall
first calculate it for the $(2 \times n)$-fold table:

$$\begin{array}{ccccc|c}
a_1 & a_2 & a_3 & \cdots & a_n & A \\
b_1 & b_2 & b_3 & \cdots & b_n & B \\
\hline
s & s & s & \cdots & s & N
\end{array}$$

Here $n$ samples of $s$ are taken, and each falls into two classes, say healthy
and diseased. The totals are $A$ and $B$, and $N = ns = A + B$. It will be
convenient to put $A = pN$, $B = qN$.


$$\chi^2 = \frac{\Sigma a_i^2}{pqs} - \frac{nps}{q}.$$

Now

$$(\Sigma a_i^2)^2 = \Sigma a_i^4 + 2\Sigma a_i^2 a_j^2,$$

the second summation being taken over all unequal pairs of values of $i$ and $j$.

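Hence

$$\begin{aligned}
E(\Sigma a_i^2)^2 &= nE(a_i^4) + n(n-1)E(a_i^2 a_j^2) \\
&= nE[a_i(a_i-1)(a_i-2)(a_i-3) + 6a_i(a_i-1)(a_i-2) + 7a_i(a_i-1) + a_i] \\
&\quad + n(n-1)E[a_i(a_i-1)a_j(a_j-1) + 2a_i(a_i-1)a_j + a_i a_j] \\
&= \frac{A(A-1)(A-2)(A-3)}{N(N-1)(N-2)(N-3)}\,[ns(s-1)(s-2)(s-3) + n(n-1)s^2(s-1)^2] \\
&\quad + \frac{A(A-1)(A-2)}{N(N-1)(N-2)}\,[6ns(s-1)(s-2) + 2n(n-1)s^2(s-1)] \\
&\quad + \frac{A(A-1)}{N(N-1)}\,[7ns(s-1) + n(n-1)s^2] + \frac{A}{N}\,ns.
\end{aligned}$$

Since $N = ns$, $A = pns$, we have

$$\begin{aligned}
\frac{1}{pns}\,E(\Sigma a_i^2)^2 &= \frac{(A-1)(A-2)(A-3)}{(N-1)(N-2)(N-3)}\,(s-1)[ns^2 - (n+4)s + 6] \\
&\quad + \frac{(A-1)(A-2)}{(N-1)(N-2)}\,2(s-1)[(n+2)s - 6] + \frac{A-1}{N-1}\,[(n+6)s - 7] + 1 \\
&= \frac{ns^2}{(ns-1)(ns-2)(ns-3)}\,\big[p^3 n^2 s(s-1)\{ns^2 - (n+4)s + 6\} \\
&\quad + 2p^2 n(n-1)s(s-1)(ns-6) \\
&\quad + p(n-1)\{n(n+1)s^2 - 2(3n-4)s - 2\} - 2(n-1)(s-1)\big].
\end{aligned}$$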


So the variance of $\chi^2$ is given by

$$\begin{aligned}
V(\chi^2) &= \frac{E(\Sigma a_i^2)^2}{p^2 q^2 s^2} - \left[\frac{(n-1)ns}{ns-1} + \frac{nps}{q}\right]^2 \\
&= \frac{E(\Sigma a_i^2)^2}{p^2 q^2 s^2} - \frac{n^2 s^2 [pn(s-1) + n - 1]^2}{q^2 (ns-1)^2} \\
&= \frac{n^2 s}{pq^2 (ns-1)^2 (ns-2)(ns-3)}\,\big[(ns-1)\{p^3 n^2 s(s-1)(ns^2 - ns - 4s + 6) \\
&\qquad + 2p^2 n(n-1)s(s-1)(ns-6) + p(n-1)\{n(n+1)s^2 - 2(3n-4)s - 2\} - 2(n-1)(s-1)\} \\
&\qquad - ps(ns-2)(ns-3)\{pn(s-1) + n - 1\}^2\big] \\
&= \frac{2n^2 (n-1)s(s-1)}{pq^2 (ns-1)^2 (ns-2)(ns-3)}\,[p^3 n^2 s^2 - 2p^2 n^2 s^2 + p(n^2 s^2 + ns - 1) - (ns-1)] \\
&= \frac{2n^2 (n-1)s(s-1)\,(pqn^2 s^2 - ns + 1)}{pq(ns-1)^2 (ns-2)(ns-3)} \\
&= \frac{2n^4 (n-1)s^3 (s-1)}{(ns-1)^2 (ns-2)(ns-3)} \left(1 - \frac{ns-1}{AB}\right) \\
&= \frac{2(n-1)N^3 (N-n)}{(N-1)^2 (N-2)(N-3)} \left(1 - \frac{N-1}{AB}\right). \tag{3}
\end{aligned}$$


This becomes $2(n-1)$ when $A$ and $B$ tend to infinity.

When $A$ and $B$ are both of order $N$, that is to say neither $p$ nor $q$ is small,
it becomes

$$2(n-1)\left[1 - \left(n + \frac{1}{pq} - 7\right)N^{-1}\right] + O(N^{-2}).$$

Cochran's expression is $2(n-1)(1 - nN^{-1}) + O(N^{-2})$, which is accurate when
$p = \tfrac{1}{2} \pm \tfrac{\sqrt{21}}{14}$, i.e. 0.8273 or 0.1727. It will be seen that the variance is diminished
when $n > 7 - \dfrac{1}{pq}$, which is certainly the case if $n > 3$. If $A$ remains finite when $N$
and $B$ tend to infinity we find $E(\chi^2) = n - 1$,

$$V(\chi^2) = 2(n-1)\left(1 - \frac{1}{A}\right),$$

a formula already given by Haldane (1937).
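
The behaviour of formula (3) can be checked numerically. The following is a minimal sketch, not part of the original paper: the function names are hypothetical, and exact rational arithmetic is used so that the comparison with the first-order expansion and with Cochran's leading term is free of rounding error.

```python
from fractions import Fraction


def variance_equal_samples(n, s, A):
    """Exact variance of chi-square from formula (3).

    n samples of s individuals each; A individuals fall in the first class
    and B = N - A in the second, where N = n * s (N must exceed 3).
    """
    N = n * s
    B = N - A
    lead = Fraction(2 * (n - 1) * N**3 * (N - n),
                    (N - 1)**2 * (N - 2) * (N - 3))
    return lead * (1 - Fraction(N - 1, A * B))


def variance_first_order(n, s, A):
    """Expansion 2(n-1)[1 - (n + 1/(pq) - 7)/N], neglecting O(N^-2) terms."""
    N = n * s
    p = Fraction(A, N)
    q = 1 - p
    return 2 * (n - 1) * (1 - (n + 1 / (p * q) - 7) / N)


def variance_cochran(n, s):
    """Leading terms of Cochran's expression, 2(n-1)(1 - n/N)."""
    return 2 * (n - 1) * (1 - Fraction(n, n * s))


if __name__ == "__main__":
    n, s, A = 5, 200, 400
    print(float(variance_equal_samples(n, s, A)))
    print(float(variance_first_order(n, s, A)))
    print(float(variance_cochran(n, s)))
```

With $A$ held small and $N$ made very large, the exact value approaches $2(n-1)(1 - 1/A)$, as stated in the text.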

Formula (3) may be compared with the values given by Haldane (1937) for 
of freedom. In the terminology of this paper

$$E(\chi^2) = n, \qquad V(\chi^2) = 2n + \frac{n}{s}\left(\frac{1}{pq} - 6\right).$$

Similarly the limiting case given above may be compared with the values for
an $n$-fold table with $n$ degrees of freedom, namely,

f = n, 

ViX^) = 2n + ~. 


The variance for a $(m \times n)$-fold table

This of course includes the variance found above as a special case. However,
as the summations involved are somewhat hard to follow, the simpler algebra
for the special case has been given separately.



Here summation over the values of i, k on the one hand, and j, I on the other, are 

independent and commutative. S(*) = SrS(®)l) where k assumes all values 

ifc i tic J 

from 0 to n inclusive, except i, '^{x) has a similar meaning. 

}i 

Thus S(l) - n, 2K') = iV', {l)j = 2pi + and so on. 

Let $S = \Sigma_i s_i^{-1}$, $T = \Sigma_j t_j^{-1}$.

Then

$$a_{ij}^4 = a_{ij}(a_{ij}-1)(a_{ij}-2)(a_{ij}-3) + 6a_{ij}(a_{ij}-1)(a_{ij}-2) + 7a_{ij}(a_{ij}-1) + a_{ij},$$

$$a_{ij}^2 a_{kj}^2 = a_{ij}(a_{ij}-1)a_{kj}(a_{kj}-1) + a_{ij}(a_{ij}-1)a_{kj} + a_{ij}a_{kj}(a_{kj}-1) + a_{ij}a_{kj},$$

and so on. The expectations of the two middle terms in the last expression are of
course equal. Remembering that

$$E[a_{ij}(a_{ij}-1)(a_{ij}-2)(a_{ij}-3)] = \frac{s_i(s_i-1)(s_i-2)(s_i-3)\,t_j(t_j-1)(t_j-2)(t_j-3)}{N(N-1)(N-2)(N-3)}$$

and

$$E[a_{ij}(a_{ij}-1)a_{kj}(a_{kj}-1)] = \frac{s_i(s_i-1)\,s_k(s_k-1)\,t_j(t_j-1)(t_j-2)(t_j-3)}{N(N-1)(N-2)(N-3)},$$

and so on, we have


{Si~ 1) (^i“2) (5{— i 


■2)(i;j-3) 


X V G?.i-I)(^fe-l)«j-l)(^j-2) (<j~ 3) 


+ ^ ( 5 ,_ 1) 1) (i 1) 

m im 

. 1 r«^(.9,.-i)(5i~2)o,.-i)(i^-2) 

'^N(N~l)(N-2}Cf 

+ 2S + 2S 


+ 2E(«. 




r, 

¥-i)L 


7lKJUk^2}+-^kJ^^fi^+ 2^(1) 

ij ikj ijl ikjl 


]"s?(4)- 


Hence 


N~^E[{xHNf] 

^ - 6«i + 1 1 - esj- 1) + S (Si - 1 ) (Sft - 1 ) 

X rE(«|- 6iy+ 11 - + S (ij-1) (ti- 1) 

Li ji 


+ S(«1 - 3 + 2si^) + S(5,- 1)) {s(t^ - 3 + 2tf^) + S(L - 1 )) 
u ifc ) u jj )_ 

+j^j [6S(1 - 5^-') 2(1 - if^) + { 2 ( 1 - + 2(1)) 

X {2(1 + 2(1)11 + ^ 2(5^■ ') 2«r') 

^ pr^l)(iV-2) +nHl0n--6)S] 

X [jy»- 2(m + 2) y + m»+ 10m- 67] + ^^ 

x[2(iV-3n + 2^) (N--Sm + 2T) + {nN-n^-2n + 2S}(mN-m^~2m + 2T)] 
+ - »S) (w - T) + (w^i - iSf) (m2 --T)] + ST, 
$N^{-1}(N-1)(N-2)(N-3)\,E[(\chi^2 + N)^2]$

= [N^-2(n + 2)N + nHm-^S][N^-2{m + 2)N+mHlOm-%T] 

+ 2(iV-3) [{nm + 2)N^ 

-{nm{n + m) + inm + 6{n+m)-2(n+2)T-2{m + 2)S}N 
+ nV + 2nm{n + m) + 22nw - 2{n^ + Sw) ^ - 2{rnP‘ + 8m)S+ 12ST] 

+ {N - 2) {N~ 3) [n^m^ + &nm-{rfi+U)T -{m^ + Qm) 8 + 18T] 

+ {N-1){N~2){N-3)ST 
= N^+2{a-^-2)N^ + [(a-^f-6a+&^ + 4:]m 
+ ( - 3a® + 8ay? - 4/S2 + 6a - 4/?) + a2 - 2a/? + 4a - [(%2 + 2w - 2) 7 
+ (m® + 2m - 2) S] + [(w® -2n)T + (m® - 2m) 8]N + N^{N +l)ST, 
where a = nm, ^ = n+m. 

But + 


Hence 


N{N+a-j3) 

N-1 


N-^N -1)^(N-2)(N~3) V(x^) 

- (i^-l)[iV'H2(a-/?-2)]\r®+etc.]-i^(i\^-2)(i^-3)(i\r + a-/?)® 
= 2{a-^ + l)N^ + ia.H2^-4:)N^-2{a^-a^+/3Ha-2j3)N 
— a(a - 2/? + 4) - [(w® + 2n - 2) T + (m® + 2m - 2) ^S] N^(N - 1) 

+ [( 71 ®- 2w) T + (m® - 2m) S] N{N~ 1) + STN^{m-l). 

Thus

$$\begin{aligned}
V(\chi^2) = \frac{N}{(N-1)(N-2)(N-3)}\,\Big[ &\big\{2(n-1)(m-1)N^3 + (n^2 m^2 + 2n + 2m - 4)N^2 \\
&\quad - 2\{nm(n-1)(m-1) + (n+m)(n+m-2)\}N - nm(n-2)(m-2)\big\} \times \frac{1}{N-1} \\
&- \{(n^2 + 2n - 2)T + (m^2 + 2m - 2)S\}N^2 \\
&+ \{n(n-2)T + m(m-2)S\}N + STN^2(N+1)\Big], \tag{4}
\end{aligned}$$

where $S$ and $T$ have been defined on p. 361 above.
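
Because the general expression is long, it is useful to confirm it by brute force on very small tables. The sketch below is an illustration only and is not taken from the paper: it enumerates every assignment of the $N$ individuals to classes (all $N!$ orderings of the class labels, each equally likely given the margins), computes the exact mean and variance of $\chi^2$, and compares them with formula (1) and with the general variance formula. All function names are hypothetical, and the enumeration is practical only for $N$ of about 8 or less.

```python
from fractions import Fraction
from itertools import permutations


def chi_square(table, s, t):
    """chi^2 = N * sum(a_ij^2 / (s_i t_j)) - N, as an exact fraction."""
    N = sum(s)
    total = sum(Fraction(a * a, s[i] * t[j])
                for i, row in enumerate(table)
                for j, a in enumerate(row))
    return N * total - N


def enumerated_moments(s, t):
    """Exact mean and variance of chi^2 given margins s (samples), t (classes)."""
    rows = [i for i, si in enumerate(s) for _ in range(si)]
    cols = [j for j, tj in enumerate(t) for _ in range(tj)]
    count, total, total_sq = 0, Fraction(0), Fraction(0)
    for perm in permutations(cols):          # N! equally likely orderings
        table = [[0] * len(t) for _ in s]
        for r, c in zip(rows, perm):
            table[r][c] += 1
        x = chi_square(table, s, t)
        count += 1
        total += x
        total_sq += x * x
    mean = total / count
    return mean, total_sq / count - mean * mean


def mean_formula(s, t):
    """Formula (1): (m-1)(n-1)N/(N-1)."""
    n, m, N = len(s), len(t), sum(s)
    return Fraction((m - 1) * (n - 1) * N, N - 1)


def variance_formula(s, t):
    """General variance, with S = sum(1/s_i) and T = sum(1/t_j); needs N > 3."""
    n, m, N = len(s), len(t), sum(s)
    S = sum(Fraction(1, si) for si in s)
    T = sum(Fraction(1, tj) for tj in t)
    Q = (2 * (n - 1) * (m - 1) * N**3
         + (n * n * m * m + 2 * n + 2 * m - 4) * N**2
         - 2 * (n * m * (n - 1) * (m - 1) + (n + m) * (n + m - 2)) * N
         - n * m * (n - 2) * (m - 2))
    inner = (Fraction(Q, N - 1)
             - ((n * n + 2 * n - 2) * T + (m * m + 2 * m - 2) * S) * N**2
             + (n * (n - 2) * T + m * (m - 2) * S) * N
             + S * T * N**2 * (N + 1))
    return Fraction(N, (N - 1) * (N - 2) * (N - 3)) * inner


if __name__ == "__main__":
    s, t = [2, 3, 1], [4, 2]
    print(enumerated_moments(s, t))
    print(mean_formula(s, t), variance_formula(s, t))
```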
When there are $n$ samples of $s_1, s_2, s_3, \ldots, s_n$ members, falling into two classes
whose totals are $A$ and $B$, we have $m = 2$, $T = A^{-1} + B^{-1} = \dfrac{N}{AB}$, so

$$\begin{aligned}
V(\chi^2) = \frac{N^2}{(N-1)^2(N-2)(N-3)}\,\Big[ &2(n-1)N^2 + 2n(2n+1)N - 6n^2 - 6N(N-1)\Sigma s_i^{-1} \\
&+ \frac{N(N-1)}{AB}\,\{N(N+1)\Sigma s_i^{-1} - (n^2 + 2n - 2)N + n(n-2)\}\Big]. \tag{5}
\end{aligned}$$

When every $s_i = s$, this reduces to formula (3). In the case of the fourfold
table $\begin{array}{|c|c|} \hline a & b \\ \hline c & d \\ \hline \end{array}$, it becomes:

$$\begin{aligned}
V(\chi^2) = \frac{N^2}{(N-1)(N-2)(N-3)}\,\Big[ &\frac{2(N^2 + 10N - 12)}{N-1}
- 6N^2\left\{\frac{1}{(a+b)(c+d)} + \frac{1}{(a+c)(b+d)}\right\} \\
&+ \frac{N^3(N+1)}{(a+b)(c+d)(a+c)(b+d)}\Big]. \tag{6}
\end{aligned}$$

Discussion 

The expressions for the higher moments would clearly be a great deal more
complicated. The above calculations have, I think, a twofold interest. They show
that the loss of one degree of freedom arises from the fact that we are sampling
from a finite, and not an infinite, aggregate. And they point the way towards
an exact treatment of the problem of curve fitting, for which Pearson originally
designed the measure $\chi^2$. In the case of a $(n \times 2)$-fold table with $(n-1)$ degrees
of freedom we are, in effect, asking whether the observations give a satisfactory
fit to a horizontal straight line $y = k$, where $y$ is the observed frequency of a type
within a sample. If we were trying to fit a line $y = k + lx$, we should have $n-2$
degrees of freedom, $x$ having a different value for each sample. If we were trying
to fit a normal curve we should have $n-3$ degrees, and so on. Jeffreys (1938)
has pointed out the great difficulties of curve-fitting in such cases when expectations
are small. It is clear that the expected value of $\chi^2$ in such a case is not
exactly $n-3$. It may turn out to be slightly greater. Thus an investigation of
the actual law of error in a particular type of observation will demand an extension
of the present investigation to cases where several more degrees of freedom are
lost.
Editorial Note. The expression (3) of p. 360 above for the variance of $\chi^2$ in a $(2 \times n)$-fold
table corresponds to that given by B. L. Welch in 1938 ("On tests for homogeneity,"
Biometrika, 30, 188, equation (28)). E. S. P.



A NOTE ON THE STATISTICAL ANALYSIS OF SENTENCE- 
LENGTH AS A CRITERION OF LITERARY STYLE 


By C. B. WILLIAMS, Sc.D.

Department of Entomology, Rothamsted Experimental Station

Some years ago I made a number of calculations of the frequency distribution
of words of different length in different books to see to what extent authors 
kept to a definite distribution and so perhaps might be identified by such a 
method. The results obtained, however, were not striking and the work was put 
at one side. 

Mr Udny Yule (1939), however, has attacked the problem of authorship 
from the angle of the variation in sentence length, and this appears to be a 
much more fertile method of approach. 

Mr Yule shows that the frequency distribution of sentence length (i.e. 
number of words between successive full stops) is of the skew type and by 
comparing in two different manuscripts, the mean, the median, quartiles and 
deciles he is able to produce convincing mathematical evidence on the identity 
or otherwise of their authorship. 

Mr Yule does not comment on the skew distribution further than to state
(p. 371) “they are not of the Poisson type, but of the type in which the square 
of the standard deviation largely exceeds the mean”. 

When I converted some of Yule’s tables into diagrams I was struck by their 
general resemblance to certain skew distributions with which I have recently 
been dealing in some Entomological problems, and which distributions, I found, 
became normal and symmetrical if the logarithm of the number was taken as 
a basis for subdivision into groups instead of the number itself (see Williams, 
1927). 

I was unable to test this transformation on Yule’s figures as he unfortunately 
does not give the original data, but only the word length of sentences in groups 
of five; so it was necessary to obtain some new data. 

These I obtained by counting the number of words in each of 600 sentences
from the following three books: 

(1) G. K. Chesterton, A Short History of England, 1917. 

(2) H. G. Wells, The Work, Wealth and Happiness of Mankind. 

(3) G. Bernard Shaw, An Intelligent Woman's Guide to Socialism. 

All three works deal with the exposition of somewhat similar sociological 
subjects and none of them are in the “conversational” style. 

The selection of the sentences was randomized as follows. Each of the books 
is divided up into chapters, sections or both. In Chesterton’s book the first 30 
sentences were counted in each of the first 20 chapters. In Wells’s book the
first 10 sentences were counted in each chapter subdivision up to chapter vii, 
division 11. In Shaw’s book the first 15 sentences in each of sections 1-40 were 
taken. In each case the greater part of the book was covered. 

The original data thus obtained are shown diagrammatically in Fig. 1. 
Each of the distributions is of the typical skew type obtained by Yule : Shaw is 
the most extreme and varies from 3 words to 143; Wells is less skew and ranges 
from 3 to 91 ; while the Chesterton curve is the least skew and varies from 6 to 91 
with only two values over 60. 

From Table I it will be seen that the arithmetic mean number of words per
sentence is 25.87 for Chesterton, 24.11 for Wells and 31.23 for Shaw. The
medians are also different and presumably the quartiles and deciles, but these
latter were not calculated.


TABLE I

Frequency constants of the distributions of sentence length

                                    Chesterton      Wells       Shaw
Number of sentences                     600           600        600
Number of words                      16,521        14,463     18,736
Arithmetic mean no. of words          26.87         24.11      31.23
Median no. of words                   25.3          20.8       26.0
Mean log no. of words                  1.37          1.31       1.39
Geometrical mean no. of words         23.5          20.6       24.5
Standard deviation of mean log         0.200         0.237      0.290
Standard error of mean log             0.0080        0.0095     0.0112


If, however, instead of taking the frequency distribution of the actual
number of words per sentence we take that of the logarithm of the number we
get the distributions shown in Figs. 2-4. They undoubtedly show a very close
resemblance to the “normal distribution”. The mean log and standard deviation
for Chesterton is 1.37 ± 0.20; for Wells 1.31 ± 0.24 and for Shaw 1.39 ± 0.29.
The standard error of the mean is, owing to the large number of observations,
in all cases very small and approximately ±0.01.

On each of the three figures is superimposed a normal curve of the same 
area, mean and standard deviation and it will be seen how closely it fits the 
observed values. 
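
The constants in Table I that relate to the logarithmic scale can be computed along the following lines. This is a minimal sketch rather than the author's procedure: the function name is hypothetical, common (base 10) logarithms are assumed because the tabulated means near 1.3 to 1.4 correspond to sentences of 20 to 25 words, and the sample list is invented purely for illustration.

```python
import math
from statistics import mean, stdev


def log_length_summary(lengths):
    """Summarise sentence lengths (in words) on the log10 scale."""
    logs = [math.log10(n) for n in lengths]
    mean_log = mean(logs)
    sd_log = stdev(logs)                      # sample standard deviation
    return {
        "mean_log": mean_log,
        "sd_log": sd_log,
        "geometric_mean": 10 ** mean_log,     # antilog of the mean log
        "se_mean_log": sd_log / math.sqrt(len(logs)),
    }


if __name__ == "__main__":
    # Invented word counts, not data from any of the three books.
    sample = [8, 12, 15, 19, 23, 23, 27, 31, 40, 58, 77]
    for name, value in log_length_summary(sample).items():
        print(f"{name}: {value:.4f}")
```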

The following comments may, however, be made : 

(1) The greater irregularity of the observed values in the lower portion of
the distribution is due to the irregular distribution of the logarithms of integers
when grouped in small artificial divisions as in the present case. Thus there is no
logarithm of an integer between 0.01 and 0.25; none between 0.61 and 0.65
and between 0.71 and 0.75. On the other hand, there are two between 1.11
and 1.15; two between 1.16 and 1.20; only one between 1.21 and 1.25 and three
between 1.26 and 1.30. Thus in all three diagrams the frequency of 1.21-1.25
is well down and that of 1.26-1.30 is far up. Some process of smoothing would
undoubtedly eliminate these irregularities, but it was thought better to leave the
data in their original form and draw attention to the sources of irregularity.

(2) There appears to be on all three diagrams a slight shortage of high values 
and a slight excess of low ones. On the latter I have no comment but it appears 
possible that the small deficit in the longer sentences might easily be due to a 
biased effect introduced by the habit that many writers have of cutting up 
unusually long sentences into their component parts when reading over their 
manuscript or proofs. 

On the assumption that the normal curve is a sound representation of the 
frequency distribution of the log number of words in sentences, Fig. 5 has been 
prepared which shows the means and normal curves for the three books
superimposed on one another. The means are close together but the distributions are
very different.

The difference in means between Shaw and Wells is 0.09 and the standard
error of the difference is only 0.015. Thus the difference is six times the standard
error and hence certainly significant. Between Shaw and Chesterton the
difference of the means is barely significant but that between the standard
deviations is quite striking.
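
The standard error of a difference between two independent means is the root of the sum of the squared standard errors. The short sketch below, with a hypothetical function name, applies this to the standard errors of the mean log given in Table I for Wells (0.0095) and Shaw (0.0112).

```python
import math


def se_of_difference(se_a, se_b):
    """Standard error of the difference of two independent means."""
    return math.hypot(se_a, se_b)


if __name__ == "__main__":
    se = se_of_difference(0.0095, 0.0112)
    print(f"standard error of difference: {se:.4f}")
```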

If the above reasoning is correct, it is unnecessary for the comparison 
between two documents to compare arithmetic means, medians, quartiles and 
deciles, but only the log mean and the log standard deviation; all other com- 
parisons are included in these. 

It follows also that Mr Bernard Shaw, while undoubtedly under the
impression that he was punctuating at his own free will, was for this particular
book hide-bound within the limits of

$$z = \frac{1}{0.29\sqrt{2\pi}} \exp\left[-\frac{(1.4 - x)^2}{2(0.29)^2}\right],$$

while similarly Mr Wells was writing under the restricting influence of

$$z = \frac{1}{0.24\sqrt{2\pi}} \exp\left[-\frac{(1.3 - x)^2}{2(0.24)^2}\right],$$

where $z$ is the frequency and $x$ the logarithm of the number of words per
sentence.

It is also perhaps worthy of passing comment that the curve representing 
Mr Shaw is short and broad while that representing Mr Chesterton is tall and 
slender ; which shows how necessary it is to use these curves only for the purpose 
for which they were originally designed. 

Perhaps something might be added on the meaning in words of the above 
mathematical transformation. If the log distribution is normal we can infer 
that the extent to which a sentence in the process of writing is likely to vary is
at any level proportional to the length of the sentence. Thus when he is thinking
in short sentences of about 10 words an author is likely to vary say from 8 to 
12 words ; when he is thinking in longer sentences of say 100 words he will vary 
from 80 to 120. In other words the variations are proportional or geometric 
and do not merely involve the addition or subtraction of x words at all levels. 
Further, if the geometric mean is taken as a basis, sentences between this and 
half its length are as frequent as those between it and twice its length ; sentences 
down to one-quarter its length are as likely to occur as sentences up to four 
times its length; and so on. 

If the arithmetic mean were the true basis then sentences of 10 words more 
than the arithmetic mean would be as likely to occur as sentences of 10 words 
less, and it is easy to see that this is not the case. 

Before the whole theory of the use of such distributions for separating works 
of different authorships can be fully accepted it will of course be necessary to 
study the results obtained from many different works by the same author, in 
different styles, on different subjects and at different periods of his life. From 
these it may be possible to find what variation can occur “within authors” as 
compared with “between authors”. This note is not meant to deal with this 
basic problem but only to draw attention to the simplification of the method 
of approach to such a problem by the use of a transformation which produces 
a normal instead of a skew distribution. 
APPLICATIONS OF THE NON-CENTRAL t-DISTRIBUTION

By N. L. JOHNSON and B. L. WELCH


1. Introduction 

Statistical problems arising in connexion with the normal distribution are
simplified by the circumstance that the sample mean $\bar{x}$ and standard deviation $s$*
are jointly sufficient estimates of the corresponding population parameters $\xi$ and
$\sigma$. Also the distributions of $\bar{x}$ and $s$ have simple forms and together with Student’s
t-distribution provide complete solutions of any problems of testing hypotheses
or of estimating fiducial limits relating to either $\xi$ or $\sigma$ singly. In the present paper
we shall consider some of the questions which arise when our main concern is not
with either $\xi$ or $\sigma$ alone, but with some function of the two. In particular we shall
consider a number of cases which all lead to the use of what has been called the
non-central t-distribution. Tables of the probability integral of this distribution
will be given in a form suitable for the solution of the problems raised.


2. The non-central t-distribution

Let $z$ be a quantity distributed normally about zero with unit standard
deviation and let $w$ be a quantity distributed independently as $\chi^2/f$, where $f$ is the
number of degrees of freedom of the $\chi^2$. Then if $t$ is defined by the equation

$$t = \frac{z + \delta}{\sqrt{w}}, \tag{1}$$

where $\delta$ is some constant, then $t$ is distributed in a manner depending only on $\delta$
and $f$. This distribution will be termed a non-central t-distribution. When $\delta$ equals
zero we have the familiar Student’s $t$.

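In general the elementary probability distribution of $t$ is given by

$$p(t) = \frac{f!}{2^{\frac{1}{2}(f-1)}\,\Gamma(\tfrac{1}{2}f)\,\sqrt{\pi f}}\;
e^{-\frac{1}{2}\frac{f\delta^2}{f+t^2}}\left(\frac{f}{f+t^2}\right)^{\frac{1}{2}(f+1)}
Hh_f\!\left(-\frac{\delta t}{\sqrt{f+t^2}}\right), \tag{2}$$

where

$$Hh_f(z) = \int_0^\infty \frac{v^f}{f!}\,e^{-\frac{1}{2}(v+z)^2}\,dv. \tag{3}$$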


An account of some of the properties of the above-defined function has been 
given by R. A. Fisher (1931). He derives a result equivalent to the above equation 
(2) although with a different notation, his / being equivalent to our tj^J and his 

* $s^2 = \Sigma(x - \bar{x})^2/(n-1)$, where $n$ is the sample size.

T being our ^/^(/+ 1). Tables of the Hh function have been calculated by J. R. Airey
(1931). These tables, however, are not useful for solving the problems which are
considered in the present paper. For these problems the probability integral
of the non-central t-distribution is required. Existent tables of this integral
(J. Neyman, 1936 and J. Neyman & B. Tokarska, 1936) have been calculated only
for one rather restricted purpose. In the present paper much more extensive
tables will be provided which, it is hoped, will cover all the applications of the
non-central t-distribution likely to be encountered.

The probability integral of non-central $t$ demands a table of triple entry, since
the probability that $t$ exceeds $t_0$, say, depends on $f$, $\delta$, and $t_0$. The notations

$$P(f, \delta, t_0) = P\{t > t_0 \mid f, \delta\} \tag{4}$$

will be used to denote this probability.

Often it is necessary to find what value of $t_0$ is such that the probability (4)
will take a specified value $\epsilon$, say. This will be a function of $f$, $\delta$, and $\epsilon$ and it will
be convenient to denote it by $t(f, \delta, \epsilon)$. Thus

$$P\{t > t(f, \delta, \epsilon) \mid f, \delta\} = \epsilon.$$

Again, often $t_0$ will be given and the value of $\delta$ which makes (4) take the value $\epsilon$
will be required. It will be convenient to denote this value of $\delta$ by $\delta(f, t_0, \epsilon)$. Thus

$$P\{t > t_0 \mid f, \delta(f, t_0, \epsilon)\} = \epsilon.$$
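
In the absence of the printed tables, the three functions $P(f, \delta, t_0)$, $t(f, \delta, \epsilon)$ and $\delta(f, t_0, \epsilon)$ can be evaluated numerically. The sketch below is an illustration under stated assumptions, not the method used to compute the paper's tables: it integrates over the distribution of $\chi = \sqrt{fw}$ by Simpson's rule, using the fact from (1) that $t > t_0$ exactly when $z > t_0\sqrt{w} - \delta$, and it inverts by bisection. The function names, the integration range and the bisection brackets are all hypothetical choices; $f$ is assumed to be an integer of at least 1.

```python
import math
from statistics import NormalDist

_PHI = NormalDist().cdf  # unit normal distribution function


def upper_tail(f, delta, t0, steps=2000):
    """P{t > t0 | f, delta} for t = (z + delta) / sqrt(w), w = chi^2_f / f."""
    log_norm = (1 - f / 2) * math.log(2) - math.lgamma(f / 2)
    root_f = math.sqrt(f)

    def integrand(x):
        # Density of chi with f degrees of freedom, times P{z > t0*x/sqrt(f) - delta}.
        if x == 0.0:
            dens = math.exp(log_norm) if f == 1 else 0.0
        else:
            dens = math.exp(log_norm + (f - 1) * math.log(x) - x * x / 2)
        return dens * _PHI(delta - t0 * x / root_f)

    upper = root_f + 12.0          # chi density is negligible beyond this point
    h = upper / steps              # steps must be even for Simpson's rule
    total = integrand(0.0) + integrand(upper)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * integrand(k * h)
    return total * h / 3


def _solve(func, lo, hi, target, increasing):
    """Bisection for func(x) = target on [lo, hi], func monotone."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if (func(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
        if hi - lo < 1e-9:
            break
    return 0.5 * (lo + hi)


def t_value(f, delta, eps):
    """t(f, delta, eps): the t0 exceeded with probability eps."""
    return _solve(lambda t0: upper_tail(f, delta, t0), -1e3, 1e3, eps, False)


def delta_value(f, t0, eps):
    """delta(f, t0, eps): the delta for which P{t > t0} equals eps."""
    bound = 50.0 + 5.0 * abs(t0)
    return _solve(lambda d: upper_tail(f, d, t0), -bound, bound, eps, True)


if __name__ == "__main__":
    t0 = t_value(9, 3.0, 0.05)
    print(t0, upper_tail(9, 3.0, t0), delta_value(9, t0, 0.05))
```

The bisection relies on $P(f, \delta, t_0)$ decreasing in $t_0$ and increasing in $\delta$, which follows directly from (1).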

Space does not permit the tabling of all the three functions $P(f, \delta, t_0)$, $t(f, \delta, \epsilon)$
and $\delta(f, t_0, \epsilon)$. For reasons which will become clear later in the paper it was decided
to table $\delta(f, t_0, \epsilon)$ most fully. Table IV, given at the end of the paper, facilitates the
direct calculation of $\delta(f, t_0, \epsilon)$ for seventeen probability levels $\epsilon$. It can also be
used without difficulty for calculating $t(f, \delta, \epsilon)$. Table V is an additional short table
from which $t(f, \delta, \epsilon)$ can be calculated rather more directly but only for the
probability levels $\epsilon = 0.05$ and $\epsilon = 0.95$. These tables are given in the most suitable
form which we have been able to evolve consistent with necessary limitations on
size.

Before the tables are described, the next three sections will be devoted to a 
description of some of the situations where they are required. 

3. Coefficient of variation

The first function of $\xi$ and $\sigma$ to be discussed will be the coefficient of variation
$V = \sigma/\xi$. In the practical situations where this index is an appropriate measure
of variability, the variable $x$ is usually necessarily positive. Now for a normal
population the ratio of the mean to the standard deviation has to be of the order
of 3 or more, for the chance of a negative $x$ to be negligible. Strictly speaking,
therefore, the sampled population should not be assumed normal if the coefficient
of variation is too great. The figure 33 % has often been stated as the permissible
upper limit. In practice coefficients of variation are usually much smaller than
this, and it is assumed in the present section that we are dealing with such cases.

An estimate of $V$ is provided by the sample coefficient of variation $v = s/\bar{x}$.
Now since we may write

$$\frac{\sqrt{n}}{v} = \frac{\sqrt{n}\,\bar{x}}{s} = \left\{\frac{\sqrt{n}(\bar{x} - \xi)}{\sigma} + \frac{\sqrt{n}\,\xi}{\sigma}\right\} \Big/ \frac{s}{\sigma},$$

it appears from comparison with (1) that $\sqrt{n}/v$ is distributed as non-central $t$ with
$f = (n-1)$ and $\delta = \sqrt{n}/V$ (McKay, 1932). The solution of problems relating to $V$
is therefore easily effected.

For instance, suppose it is desired to test whether the sample contradicts the
hypothesis that $V = V_0$, and that we decide to reject the hypothesis when $v > v_0$,
where $v_0$ is chosen so that $P\{v > v_0 \mid V = V_0\}$ is equal to some specified small chance
$\epsilon$. In the notation of the previous section $v_0$ will then be given by

$$\frac{\sqrt{n}}{v_0} = t\left(n-1, \frac{\sqrt{n}}{V_0}, 1-\epsilon\right).$$

Again, consider an example where it is decided to reject a sample as unsatisfactory
when $v$ is greater than a given value $v_0$, and where it is required to know
how low the true coefficient of variability should be kept to ensure that the
probability of rejection will not exceed a given $\epsilon$, i.e. we require $V_0$ such that
$P\{v > v_0 \mid V = V_0\} = \epsilon$. In the notation of Section 2, $V_0$ is given by

$$\frac{\sqrt{n}}{V_0} = \delta\left(n-1, \frac{\sqrt{n}}{v_0}, 1-\epsilon\right). \tag{7}$$

Or finally suppose that a value of $v$ is observed and an upper fiducial limit of
$V$ is required so that the chance is $\epsilon$ of this limit being exceeded, i.e. a lower
fiducial limit of $\sqrt{n}/V$ is required. Now since

$$P\left\{\frac{\sqrt{n}}{v} > t\left(n-1, \frac{\sqrt{n}}{V}, \epsilon\right)\right\} = \epsilon$$

and since the inequality

$$\frac{\sqrt{n}}{v} > t\left(n-1, \frac{\sqrt{n}}{V}, \epsilon\right)$$

is equivalent to the inequality*

$$\frac{\sqrt{n}}{V} < \delta\left(n-1, \frac{\sqrt{n}}{v}, \epsilon\right),$$

it is seen that the required upper fiducial limit of $V$ is

$$\frac{\sqrt{n}}{\delta\left(n-1, \sqrt{n}/v, \epsilon\right)}. \tag{8}$$

* This follows from the fact that $t(f, \delta, \epsilon)$ is a monotonically increasing function of $\delta$.
4. The power of Student’s t-test

Suppose it is desired to test the hypothesis $H_0$ that the mean $\xi$ of a normal
population has the value $\xi_0$. Student’s t-test consists in calculating from the data
the quantity

$$t = \frac{\sqrt{n}(\bar{x} - \xi_0)}{s} \tag{9}$$

and referring it to the usual central t-distribution with $f = (n-1)$ degrees of
freedom. Thus if the only alternative hypotheses to be taken into consideration
are those for which $\xi > \xi_0$, the test will consist in rejecting the hypothesis when
$t > t_0$, where $t_0$ is such that $P\{t > t_0 \mid H_0\} = \epsilon$, and $\epsilon$ is the conventional level of
significance. In our notation

$$t_0 = t(n-1, 0, \epsilon). \tag{10}$$

Now when $H_0$ is not true, but $\xi$ has some alternative value $\xi_1$, it is often required
to know how powerful Student’s test will be, i.e. what chance the test will have of
rejecting $H_0$. But we can write (9) in the form

$$t = \left\{\frac{\sqrt{n}(\bar{x} - \xi_1)}{\sigma} + \frac{\sqrt{n}(\xi_1 - \xi_0)}{\sigma}\right\} \Big/ \frac{s}{\sigma}, \tag{11}$$

whence, comparing with (1), it now appears that the quantity calculated from the
data is distributed in the non-central t-distribution with $f = (n-1)$ and

$$\delta = \frac{\sqrt{n}(\xi_1 - \xi_0)}{\sigma}.$$

The power of the test is therefore $P\left(n-1, \sqrt{n}(\xi_1 - \xi_0)/\sigma, t_0\right)$. The value of
$(\xi_1 - \xi_0)/\sigma$ for which the power reaches any specified value $\eta$, say, is given by

$$\frac{\sqrt{n}(\xi_1 - \xi_0)}{\sigma} = \delta(n-1, t_0, \eta). \tag{12}$$

Tables for evaluating the power of the t-test together with a discussion of their use
have been given by J. Neyman (1935) and J. Neyman & B. Tokarska (1936).
They are not restricted to the simple case of testing whether a mean has a
specified value but apply to all cases in which the t-test is used.


5. Proportion defective problems

Another class of problem where the non-central t-distribution has an application
occurs when objects are classified as defective or non-defective according to
whether they have values of a characteristic exceeding or falling short of a fixed
standard. Thus, if an object is defective when the character $x$ exceeds a fixed given
level $L$, then information will often be required about the proportion $P$ falling
beyond $L$ in the population. If the population be normal $P$ clearly depends only
on the ratio

$$U = \frac{L - \xi}{\sigma}.$$

An estimate of $U$ is provided from the sample by calculating

$$u = \frac{L - \bar{x}}{s}. \tag{13}$$

Corresponding to deviate $u$ the equivalent normal probability $p$ will then give
an estimate of $P$. This situation has been discussed recently by W. J. Jennett &
B. L. Welch (1939).* The transference from $U$ to $P$ and from $u$ to $p$ does of course
require that the sampled distribution is really approximately normal and for
this reason care must be taken in going out to the “tails” of the distribution
(i.e. when $U$ is large and $P$ small).

In order to allow for the sampling errors arising from this method of estimating
$P$ we have only to note that (13) may be written

$$\sqrt{n}\,u = \left\{-\frac{\sqrt{n}(\bar{x} - \xi)}{\sigma} + \frac{\sqrt{n}(L - \xi)}{\sigma}\right\} \Big/ \frac{s}{\sigma}. \tag{14}$$

Hence $\sqrt{n}\,u$ is distributed as non-central $t$ with $f = (n-1)$ and $\delta = \sqrt{n}\,U$. Thus if
the proportion $P$ (and hence $U$) were known the value $u_0$ of $u$ such that $P\{u > u_0\} = \epsilon$
would be given by

$$\sqrt{n}\,u_0 = t(n-1, \sqrt{n}\,U, \epsilon). \tag{15}$$

Conversely, given $u$ from a sample, a lower fiducial limit of $U$ is obtained by
noting that the inequality

$$\sqrt{n}\,u > t(n-1, \sqrt{n}\,U, \epsilon)$$

is equivalent to the inequality

$$\sqrt{n}\,U < \delta(n-1, \sqrt{n}\,u, \epsilon). \tag{16}$$

Hence a lower fiducial limit for $U$ will be given by $\delta(n-1, \sqrt{n}\,u, \epsilon)/\sqrt{n}$. In the above
analysis the proportion falling beyond $L$ has been termed a “proportion defective”.
In industrial problems this is often a fair description. More generally
the analysis refers to any problem where we are primarily interested in how a
measurable object is related to a fixed level. A slightly different problem occurs
when we are not given $L$ but are asked to estimate the value of $L$ so that $P$ will
have an assigned value. For instance, the value of $L$ may be required so that the
chance is 1 in 20 of it being exceeded.

Now if $U$ is the normal deviate which is exceeded with probability $P$, then the
required $L$ is related to $\xi$ and $\sigma$ by the relation

$$L = \xi + U\sigma.$$

An estimate of $L$ is therefore given by

$$l = \bar{x} + Us. \tag{17}$$

* For a general discussion of similar problems arising in the control of industrial products the
reader may be referred to the British Standards Institution Publication, No. 600 (E. S. Pearson,
1936).
If fiducial limits are required for $L$ they may be obtained by returning to equations
(13) and (15). $L$ is not now given, but $U$ is definitely known and therefore (15)
provides the value of $u_0$ such that $P\{u > u_0\} = \epsilon$. But the inequality

$$\frac{L - \bar{x}}{s} > u_0$$

can be reversed to read

$$L > \bar{x} + u_0 s.$$

Hence, if we write

$$l_0 = \bar{x} + \frac{s\,t(n-1, \sqrt{n}\,U, \epsilon)}{\sqrt{n}}, \tag{18}$$

we shall have $P\{L > l_0\} = \epsilon$. Hence $l_0$ will be a fiducial limit for $L$, which will be
exceeded by $L$ in a proportion $\epsilon$ of cases.


6. Large sample results 

Before going on to discuss the distribution of non-central $t$ in general and the
method of applying the tables given below, it will be convenient to consider the
situation when $f$ is large. The distribution then approaches the normal form. The
important question is how quickly.

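The first three moments of $t$ are given exactly by the following expressions:

$$\mu_1' = \sqrt{\tfrac{1}{2}f}\;\frac{\Gamma\{\tfrac{1}{2}(f-1)\}}{\Gamma(\tfrac{1}{2}f)}\;\delta, \tag{19}$$

$$\mu_2 = \frac{f}{f-2}\,(1 + \delta^2) - \mu_1'^2, \tag{20}$$

$$\mu_3 = \mu_1' \left[\frac{f(2f - 3 + \delta^2)}{(f-2)(f-3)} - 2\mu_2\right]. \tag{21}$$

If the gamma functions are expanded in powers of $1/f$ and only the leading
terms are retained, these give approximately

$$\mu_1' = \delta; \qquad \mu_2 = 1 + \frac{\delta^2}{2f}; \qquad
\sqrt{\beta_1} = \frac{\delta}{f}\left[3 + \frac{5\delta^2}{4f}\right]\left[1 + \frac{\delta^2}{2f}\right]^{-\frac{3}{2}}. \tag{22}*$$

In large samples, therefore, $t$ becomes normally distributed about $\delta$ with standard
deviation $\left[1 + \dfrac{\delta^2}{2f}\right]^{\frac{1}{2}}$. The rapidity of the approach to symmetry is indicated by the
true values of $\sqrt{\beta_1}$ in Table I. For given $f$ the greatest value $\sqrt{\beta_1}$ can take is shown
in the last column.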

* The term 5^jf is retained because in most problems 5 will be of the order //. 


TABLE I 

$\sqrt{\beta_1}$ of t for different f and $\delta$ 

    f \ δ/√f      0       1       2       3       ∞ 
       4        0.00    3.70    4.61    4.86    5.09 
       6        0.00    1.67    2.13    2.26    2.38 
       8        0.00    1.20    1.55    1.65    1.74 
      12        0.00    0.84    1.10    1.18    1.25 
      24        0.00    0.53    0.69    0.74    0.79 
       ∞        0.00    0.00    0.00    0.00    0.00 
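The entries of Table I can be checked numerically. The following is a minimal Python sketch, not the authors' procedure: it assumes the usual closed forms for the first three moments of non-central t (mean $\delta\sqrt{f/2}\,\Gamma\{(f-1)/2\}/\Gamma(f/2)$, and so on), and it assumes that the columns of the table are indexed by $\delta/\sqrt{f}$, which is the reading that reproduces the tabled figures. The function names are illustrative only.

```python
import math


def noncentral_t_moments(f, delta):
    """Mean, variance and third central moment of non-central t (needs f > 3)."""
    if f <= 3:
        raise ValueError("the third moment exists only for f > 3")
    # c = sqrt(f/2) * Gamma((f-1)/2) / Gamma(f/2), computed through log-gamma
    c = math.sqrt(f / 2.0) * math.exp(
        math.lgamma((f - 1) / 2.0) - math.lgamma(f / 2.0)
    )
    mean = c * delta
    var = f / (f - 2.0) * (1.0 + delta * delta) - mean * mean
    mu3 = mean * (
        f * (2.0 * f - 3.0 + delta * delta) / ((f - 2.0) * (f - 3.0)) - 2.0 * var
    )
    return mean, var, mu3


def skewness(f, delta):
    """Exact sqrt(beta_1) of non-central t."""
    _, var, mu3 = noncentral_t_moments(f, delta)
    return mu3 / var ** 1.5


def skewness_large_sample(f, delta):
    """Leading-term approximation to sqrt(beta_1) for large f."""
    return (
        (delta / f)
        * (3.0 + 5.0 * delta * delta / (4.0 * f))
        * (1.0 + delta * delta / (2.0 * f)) ** -1.5
    )


if __name__ == "__main__":
    for f in (4, 6, 8, 12, 24):
        row = [skewness(f, ratio * math.sqrt(f)) for ratio in (0, 1, 2, 3)]
        print(f, " ".join(f"{value:5.2f}" for value in row))
```

Running the script prints one row per f, for $\delta/\sqrt{f}$ = 0, 1, 2, 3, which may be compared with the first four columns of the table.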


It is seen from Table I that values of $\sqrt{\beta_1}$ of the order of unity are likely to 
occur fairly frequently in the type of problem which has been discussed in the 
previous sections. This represents a considerable degree of skewness and another 
method of making normal approximations, outlined below, is preferable. Before 
describing this we may note that if t is referred to a normal scale with mean $\delta$ 
and standard deviation $(1 + \delta^2/2f)^{1/2}$, then the following approximations to the 
quantities defined in Section 2 will be obtained: 

$$t(f, \delta, \epsilon) = \delta + K_\epsilon\left(1 + \frac{\delta^2}{2f}\right)^{1/2}, \qquad (23)$$

where $K_\epsilon$ is the deviate of the unit normal curve exceeded with a probability $\epsilon$. 
Also $\delta(f, t_0, \epsilon)$ will be given by the solution of the equation in $\delta$ 

$$t_0 = \delta + K_\epsilon\left(1 + \frac{\delta^2}{2f}\right)^{1/2}. \qquad (24)$$

On rationalization this equation is quadratic in $\delta$ and the correct root to take will 
be obvious. 

Better approximations, which are, however, still based on the normal curve, 
may be obtained as follows. The probability that t exceeds a given value $t_0$ is 
by (1) the probability of the inequality 

$$\frac{z + \delta}{\sqrt{w}} > t_0, \qquad (25)$$

and this inequality is equivalent to 

$$(-z + t_0\sqrt{w}) < \delta. \qquad (26)$$

Now $(-z)$ is a unit normal deviate, and $\sqrt{w}$, being of the form $\chi/\sqrt{f}$, is very nearly 
normally distributed even for small f. Since z and $\sqrt{w}$ are independent, $(-z + t_0\sqrt{w})$ 
must therefore be more nearly normally distributed than $\sqrt{w}$, whatever $t_0$.* 

* This is practically obvious, but a demonstration is given in the Appendix, equation (43). 

Hence an upper limit to the skewness of $(-z + t_0\sqrt{w})$ is given for different f by the 
values of $\sqrt{\beta_1}$ of $\sqrt{w}$ in Table II. Comparison with Table I shows that it is better to take 
$(-z + t_0\sqrt{w})$ as normally distributed rather than t. The procedure is then as follows: 

Write 

$$\text{mean } \sqrt{w} = a; \quad \sigma^2_{\sqrt{w}} = \frac{b}{2f}. \qquad (27)$$

Then $(-z + t_0\sqrt{w})$ is taken to be normally distributed with mean $at_0$ and standard 
deviation $(1 + bt_0^2/2f)^{1/2}$. For given $t_0$ the value of $\delta$ such that $P\{t > t_0\}$ equals $\epsilon$ is given 
by seeking $\delta$ such that the probability is $\epsilon$ of (26) being true. This gives 

$$\delta = at_0 - K_\epsilon\left(1 + \frac{bt_0^2}{2f}\right)^{1/2}. \qquad (28)$$

TABLE II 

$\sqrt{\beta_1}$ of $\sqrt{w}$ for different f 

    f                  4       6       8      12      24       ∞ 
    √β₁ of √w        0.41    0.32    0.27    0.21    0.15    0.00 
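The skewness of $\sqrt{w}$, and the constants a and b of (27), follow from the raw moments of the $\chi$ distribution. A short Python sketch is given here as a check on Table II; it assumes the standard result $E(\chi^r) = 2^{r/2}\,\Gamma\{(f+r)/2\}/\Gamma(f/2)$, and the function names are hypothetical.

```python
import math


def chi_raw_moment(f, r):
    """E[chi**r] for chi with f degrees of freedom."""
    return 2.0 ** (r / 2.0) * math.exp(
        math.lgamma((f + r) / 2.0) - math.lgamma(f / 2.0)
    )


def chi_skewness(f):
    """sqrt(beta_1) of chi, which equals that of sqrt(w) = chi / sqrt(f)."""
    m1, m2, m3 = (chi_raw_moment(f, r) for r in (1, 2, 3))
    var = m2 - m1 * m1
    mu3 = m3 - 3.0 * m1 * m2 + 2.0 * m1 ** 3
    return mu3 / var ** 1.5


def a_and_b(f):
    """a = mean of sqrt(w); b = 2f times the variance of sqrt(w)."""
    a = chi_raw_moment(f, 1) / math.sqrt(f)
    b = 2.0 * f * (1.0 - a * a)  # E[w] = 1, so var(sqrt(w)) = 1 - a**2
    return a, b
```

Since skewness is unchanged by the scale factor $1/\sqrt{f}$, the sketch works with $\chi$ directly. Both a and b tend to unity as f increases, which is the property used in the text when they are replaced by 1.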


Conversely for given $\delta$ the value $t(f, \delta, \epsilon)$ which will be exceeded with probability 
$\epsilon$ will be approximated by solving for t the equation 

$$\delta = at - K_\epsilon\left(1 + \frac{bt^2}{2f}\right)^{1/2}. \qquad (29)$$

On rationalization this becomes a quadratic in t. This method of approximation 
was given by W. J. Jennett & B. L. Welch (1939) together with a short table of 
a and b. However, since a and b differ from unity by a quantity of the order 1/f 
and since errors at least of this magnitude are involved in the assumption of 
normality, it would perhaps have been as logical to take a and b equal to unity 
in the approximation. The numerical comparisons set out in Table III indicate 
that nothing is lost by doing this and also show how inferior is the method of 
approximation which assumes t itself to be normally distributed. 

Therefore the approximation which we shall adopt as being the best one, 
requiring only the use of the normal probability scale, will consist in assuming 
$(-z + t_0\sqrt{w})$ to be normally distributed about $t_0$ with standard deviation 
$(1 + t_0^2/2f)^{1/2}$. 

The value of $\delta$ such that $P\{t > t_0\}$ equals $\epsilon$ will then be approximated by 

$$\delta = t_0 - K_\epsilon\left(1 + \frac{t_0^2}{2f}\right)^{1/2}. \qquad (30)$$

TABLE III 

Values of $\delta/\sqrt{f}$ 

     f      ε     t0/√f     True    Approx. 1   Approx. 2   Approx. 3 
     4    0.99      2       0.056     0.736      -0.088      -0.015 
     4    0.01      2       3.972    12.023       3.848       4.015 
     4    0.99      5       1.120     2.668       0.565       0.726 
     4    0.01      5       9.247    29.230       8.840       9.274 
    24    0.99      2       1.181     1.345       0.956       1.178 
    24    0.01      2       2.819     3.163       2.800       2.822 
    24    0.99      5       3.290     3.677       3.213       3.255 
    24    0.01      5       6.752     7.593       6.087       6.745 

The values correspond to certain f, $\epsilon$ and $t_0/\sqrt{f}$ and have the property that $P(t > t_0 \mid f, \delta) = \epsilon$. 
Approximation 1 is the value obtained from equation (23), approximation 2 by using (28) with the 
correct a and b and approximation 3 by using (30), i.e. taking a = b = 1. 


Conversely, for given $\delta$ the value $t(f, \delta, \epsilon)$ which will be exceeded with probability 
$\epsilon$ will be given implicitly by 

$$\delta = t - K_\epsilon\left(1 + \frac{t^2}{2f}\right)^{1/2}, \qquad (31)$$

which on rationalizing and solving becomes 

$$t = \frac{\delta + K_\epsilon\left(1 + \dfrac{\delta^2}{2f} - \dfrac{K_\epsilon^2}{2f}\right)^{1/2}}{1 - \dfrac{K_\epsilon^2}{2f}}. \qquad (32)$$

The approximations (30) and (32) are useful even in moderate-sized samples. 
It is not intended, however, to enter here into any detailed discussion of how good 
the various approximations are. They have been considered primarily because 
they form the basis of the exact tables which are given at the end of the paper and 
which are described in the next section. 
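The two adopted approximations are easily evaluated by machine. The sketch below is an illustration only, assuming the forms $\delta = t_0 - K_\epsilon(1 + t_0^2/2f)^{1/2}$ for (30) and its rationalized inverse for (32), with $K_\epsilon$ taken as the upper $\epsilon$ point of the unit normal curve. The function names are not from the paper.

```python
from math import sqrt
from statistics import NormalDist


def upper_normal_deviate(eps):
    """K_eps: the unit normal deviate exceeded with probability eps."""
    return NormalDist().inv_cdf(1.0 - eps)


def delta_approx(f, t0, eps):
    """Approximation (30): delta for which P(t > t0) is roughly eps."""
    k = upper_normal_deviate(eps)
    return t0 - k * sqrt(1.0 + t0 * t0 / (2.0 * f))


def t_approx(f, delta, eps):
    """Approximation (32): the value of t exceeded with probability roughly eps."""
    k = upper_normal_deviate(eps)
    shrink = 1.0 - k * k / (2.0 * f)
    if shrink <= 0.0:
        raise ValueError("f is too small for this probability level")
    root = sqrt(1.0 + delta * delta / (2.0 * f) - k * k / (2.0 * f))
    return (delta + k * root) / shrink
```

For f = 9, $\delta$ = 4.053 and $\epsilon$ = 0.05, `t_approx` gives about 7.340, which is the first approximation used in Example 3 of Section 8. Because (32) is the exact solution of (31), feeding that value back into `delta_approx` recovers 4.053.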


7. Description and use of tables 

The tables are intended to facilitate the exact calculation of the quantities 
$\delta(f, t_0, \epsilon)$ and $t(f, \delta, \epsilon)$ defined in Section 2. The form in which they are presented 
was determined largely by the fact that the space they had to occupy was limited, 
and also to some extent by the method which was adopted to calculate them. 

Table IV may be used to give $\delta(f, t_0, \epsilon)$ for 17 probability levels, viz. 

$\epsilon$ = 0.005, 0.01, 0.025, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 
0.8, 0.9, 0.95, 0.975, 0.99 and 0.995. 

The basis of the table is the large sample approximation described in the last 
section, equation (30). The $K_\epsilon$ in this equation is simply the normal multiple 
which is exceeded with probability $\epsilon$. In small samples an error will be committed 
in taking this form of approximation, and in Table IV we give instead a multiple 
$\lambda(f, t_0, \epsilon)$, which is such that exactly 

$$\delta(f, t_0, \epsilon) = t_0 - \lambda(f, t_0, \epsilon)\left(1 + \frac{t_0^2}{2f}\right)^{1/2}. \qquad (33)$$

Even for samples giving f as small as 4, it is found that $\lambda$ never differs much from 
$K_\epsilon$. This stability makes interpolation easy so that, for given $\epsilon$, it is possible to 
provide a table of small compass covering all values of $t_0$ from $-\infty$ to $\infty$. 

For given $\epsilon$ the tabled quantity $\lambda$ is a function of f and $t_0$ and accordingly 
Table IV, for given $\epsilon$, is one of double entry. 

Values are given corresponding to f = 4, 5, 6, 7, 8, 9, 16, 36, 144 and $\infty$. The 
reason for choosing these values of f is that $\lambda$ differs from $K_\epsilon$ by a quantity of the 
order $1/\sqrt{f}$. The values f = 9, 16, 36, 144 and $\infty$ are such that $12/\sqrt{f}$ = 4, 3, 2, 1 and 0 
respectively. Hence interpolations for intermediate f's may be simply effected by 
considering $\lambda$ as a function of $12/\sqrt{f}$, since we are then dealing with a function 
tabled at equal intervals. 

The choice of the values of $t_0$ in the table is also determined by the necessity 
for interpolation to be simple and yet the table not to be unduly large. The 
whole range of $t_0$ from $-\infty$ to $\infty$ has to be covered. This has been done as follows. 
For $t_0/\sqrt{2f}$ between $-\infty$ and $-0.75$, $\lambda$ is given against the quantity 

$$y = \left(1 + \frac{t_0^2}{2f}\right)^{-1/2}. \qquad (34)$$

For $t_0/\sqrt{2f}$ between $-0.75$ and $0.75$, $\lambda$ is tabled against 

$$y' = \frac{t_0}{\sqrt{2f}}\left(1 + \frac{t_0^2}{2f}\right)^{-1/2}. \qquad (35)$$

For $t_0/\sqrt{2f}$ between 0.75 and $\infty$, $\lambda$ is again tabled against y. The argument interval 
for both y and y' is 0.1. It will be noted that $y = (1 - y'^2)^{1/2}$. 

From the point of view of interpolation other, perhaps simpler, functions of 
$t_0$ could have been used in constructing the table. The choice of y was made 
because the quantity $(1 + t_0^2/2f)^{1/2}$ is wanted in any case for substitution into equation 
(33) after $\lambda$ has been obtained. y' had to be used in addition because of 
interpolation difficulties in the middle of the table. 

It will be noted that in Table IV only the double entry tables for $\epsilon$ = 0.005, 0.01, 
0.025, 0.05, 0.1, 0.2, 0.3, 0.4 and 0.5 are given. For $\epsilon$ > 0.5 we can use the fact that 

$$\delta(f, t_0, \epsilon) = -\delta(f, -t_0, 1 - \epsilon), \qquad (36)$$

a relation which is apparent from the form of equation (1). 

To summarize, the steps necessary to find the value of $\delta$ such that $P\{t > t_0\} = \epsilon$ 
are as follows. 

(i) Find $y = (1 + t_0^2/2f)^{-1/2}$ or $y' = (t_0/\sqrt{2f})(1 + t_0^2/2f)^{-1/2}$ according as $|t_0|/\sqrt{2f}$ 
is greater than or less than 0.75. 

(ii) If f > 9 calculate $12/\sqrt{f}$. 

(iii) If $\epsilon$ is one of the values 0.005, ..., 0.5, enter the appropriate part of Table IV 
and obtain $\lambda(f, t_0, \epsilon)$, by interpolating with respect to the quantities obtained in 
(i) and (ii). 

(iv) Calculate $\delta(f, t_0, \epsilon) = t_0 - \lambda(f, t_0, \epsilon)(1 + t_0^2/2f)^{1/2}$. 

(v) If $\epsilon$ is one of the values 0.6, 0.7, ..., 0.995 calculate $\delta(f, -t_0, 1 - \epsilon)$ and then 
change its sign. 
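Steps (i), (ii) and (iv) are pure arithmetic and may be sketched as follows. The sketch assumes the definitions $y = (1 + t_0^2/2f)^{-1/2}$ and $y' = (t_0/\sqrt{2f})\,y$ and the relation $\delta = t_0 - \lambda/y$ implied by (33). The value of $\lambda$ must still be read from Table IV by hand; nothing here replaces the table, and the function names are illustrative.

```python
from math import sqrt


def table_iv_arguments(f, t0):
    """Arguments with which Table IV is entered for given f and t0."""
    scaled = t0 / sqrt(2.0 * f)
    y = 1.0 / sqrt(1.0 + scaled * scaled)
    return {
        "y": y,
        "y_prime": scaled * y,
        # y is the entering argument when |t0| / sqrt(2f) exceeds 0.75
        "use_y": abs(scaled) > 0.75,
        "f_argument": 12.0 / sqrt(f),
    }


def delta_from_lambda(f, t0, lam):
    """Step (iv): delta = t0 - lambda * (1 + t0**2 / 2f) ** 0.5."""
    return t0 - lam * sqrt(1.0 + t0 * t0 / (2.0 * f))
```

With f = 24, $t_0$ = 1.9231 and the tabled $\lambda$ = 0.0197 quoted in Example 1 of Section 8, `delta_from_lambda` returns approximately 1.9027.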

Inverse use of Table IV 

Next consider the calculation of $t(f, \delta, \epsilon)$, i.e. the situation where $\delta$ is given and 
$t_0$ is required so that $P\{t > t_0\} = \epsilon$. In the previous section it was shown that a 
first approximation will be given by equation (32). The true relation between 
$t(f, \delta, \epsilon)$ and $\delta$ will, however, be obtained by replacing $K_\epsilon$ in (32) by the $\lambda$ which is 
tabled in Table IV; thus 

$$t(f, \delta, \epsilon) = \frac{\delta + \lambda\left(1 + \dfrac{\delta^2}{2f} - \dfrac{\lambda^2}{2f}\right)^{1/2}}{1 - \dfrac{\lambda^2}{2f}}. \qquad (37)$$

The drawback about this equation is that $\lambda$ is given in Table IV as a function of 
$t_0$, and $t_0$ is not now known. An iteration method is therefore necessary. 

The successive steps in this iteration may be summarized as follows: 

(i) The first approximation $t_1$ is given by (32), i.e. by (37) with $K_\epsilon$ written for $\lambda$. 

(ii) Next use Table IV to calculate directly $\lambda_1 = \lambda(f, t_1, \epsilon)$. 

(iii) A second approximation $t_2$ is then given by substituting this $\lambda_1$ in (37). 

(iv) Then by finding $\lambda_2 = \lambda(f, t_2, \epsilon)$ and substituting in (37) a third 
approximation $t_3$ will be obtained. These steps must be repeated until two successive 
approximations give the same value. In practice this is found to occur very 
quickly so that there is likely to be no need of more than three approximations. 

(v) If $\epsilon$ is one of the values 0.6, 0.7, ..., 0.995 calculate $t(f, -\delta, 1 - \epsilon)$ and then 
change its sign. 
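The iteration above may be sketched in Python as follows. The argument `lookup_lambda` is a hypothetical stand-in for interpolation in Table IV: it is supplied by the user and is not part of any library. The stopping tolerance and the step limit are likewise assumptions made for the illustration.

```python
from math import sqrt
from statistics import NormalDist


def t_from_lambda(f, delta, lam):
    """Equation (37): t corresponding to delta for a given multiple lambda."""
    shrink = 1.0 - lam * lam / (2.0 * f)
    if shrink <= 0.0:
        raise ValueError("lambda**2 must be smaller than 2f")
    root = sqrt(1.0 + delta * delta / (2.0 * f) - lam * lam / (2.0 * f))
    return (delta + lam * root) / shrink


def iterate_t(f, delta, eps, lookup_lambda, tol=5e-4, max_steps=20):
    """Steps (i) to (iv): repeat until two successive approximations agree."""
    k = NormalDist().inv_cdf(1.0 - eps)
    t = t_from_lambda(f, delta, k)  # step (i), the first approximation (32)
    for _ in range(max_steps):
        lam = lookup_lambda(f, t, eps)  # steps (ii) and (iv): enter Table IV
        t_next = t_from_lambda(f, delta, lam)  # step (iii): substitute in (37)
        if abs(t_next - t) < tol:
            return t_next
        t = t_next
    raise RuntimeError("the iteration did not settle")
```

As a usage example, a lookup that always returns the final multiple 1.6800 quoted in Example 3 leads to t close to 7.447 for f = 9, $\delta$ = 4.053 and $\epsilon$ = 0.05. A real lookup would return slightly different multiples at each step, as the table does.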

Use of Table V 

The iteration process described above is not very lengthy in actual practice, 
but even this amount of trouble would be unnecessary if $\lambda$ were tabled as a function 
of $\delta$ as well as a function of $t_0$. Such a table, similar in extent to Table IV, could of 
course be calculated, but it did not seem worth while to do this, since the inverse 
method of using Table IV is not difficult. However, it has been thought useful to 
give for the single probability level $\epsilon$ = 0.05, a table which can be entered with 
$\delta$, since very often this conventional 1 in 20 chance is taken as a point of reference 
in statistical problems. 

In Table V, $\lambda$ is tabled against 

$$\frac{\delta}{\sqrt{2f}}\left(1 + \frac{\delta^2}{2f}\right)^{-1/2}$$

at intervals of 0.1 from $-1$ to 1. When $\delta$ is given, this function of $\delta$ may be quickly 
calculated and then $\lambda$ may be obtained by interpolating in Table V. The substitution 
of this value of $\lambda$ in (37) gives the required $t(f, \delta, 0.05)$. As before $t(f, \delta, 0.95)$ 
may also be calculated from the same table, being equal to minus $t(f, -\delta, 0.05)$. 


8. Examples 

In the present section some numerical examples will be worked to illustrate 
the application of Tables IV and V. Reference will be made to theoretical results 
obtained in Sections 3-5. 

Example 1. In a sample of n = 25 a coefficient of variation v = 2.6 is observed. 
Obtain from this a "median" estimate of the population coefficient V. By a 
"median" estimate is meant one for which the chance of it exceeding the true 
V is 0.50. In other words our estimate is the 50 % fiducial limit. Hence from (8) 
the required estimate is 

$$v_{0.5} = \sqrt{n}/\delta(f, t_0, 0.5), \qquad (39)$$

where $t_0 = \sqrt{n}/v = 1.9231$ and f = 24. We have 

$$12/\sqrt{f} = 2.4495.$$

From Table IV, for $\epsilon$ = 0.5, we obtain by linear interpolation* with respect to 
y' and $12/\sqrt{f}$, $\lambda$ = 0.0197. Hence from (33) 

$$\delta(24, 1.9231, 0.5) = 1.9231 - (0.0197)(1.0379) = 1.9027.$$

Therefore from (39), $v_{0.5}$ = 2.628. 

Example 2. A sample of n = 10 measurements of a normally distributed 
character x is given. The mean $\bar{x}$ is 4.7 and the standard deviation s is 0.2. Obtain 
10 % fiducial limits for the proportion P in the population exceeding x = 5.0. 

As is shown in Section 5 this is equivalent to finding limits for $U = (5.0 - \xi)/\sigma$. 
We calculate first $u = (5.0 - \bar{x})/s = 1.5$. Substituting this into (16) the upper and 
lower 10 % points for U are given by $\delta(9, 4.743, 0.9)/\sqrt{10}$ and $\delta(9, 4.743, 0.1)/\sqrt{10}$. 

We have 

$$y = \left(1 + \frac{t_0^2}{2f}\right)^{-1/2} = \frac{1}{1.5} = 0.6667.$$

From Table IV for $\epsilon$ = 0.1 we obtain by linear interpolation* 

$$\lambda(9, 4.743, 0.1) = 1.347,$$

and hence from (33) 

$$\delta(9, 4.743, 0.1) = 4.743 - (1.347)/(0.6667) = 2.723.$$

To obtain $\delta(9, 4.743, 0.9)$ we note by (36) that this is equivalent to minus 
$\delta(9, -4.743, 0.1)$. The corresponding $\lambda$ is obtained by entering the same table as 
before, with the same value of y, but in the part of the table for which $t_0$ is negative. 

We obtain $\lambda(9, -4.743, 0.1) = 1.197$. 

Hence $\delta(9, -4.743, 0.1) = -4.743 - (1.197)/(0.6667) = -6.538$, 

and therefore $\delta(9, 4.743, 0.9) = 6.538$. 

The upper and lower 10 % limits for U are therefore $6.538/\sqrt{10}$ and $2.723/\sqrt{10}$, 
i.e. 2.068 and 0.861. The corresponding limits for P are 0.019 and 0.195. 
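The arithmetic of Example 2 outside the table look-up may be sketched as below. The two values of $\delta$ are simply those quoted in the example (2.723 and 6.538); the sketch does not compute them, and the function names are illustrative.

```python
from math import sqrt
from statistics import NormalDist


def entry_value(n, mean, s, cutoff):
    """t0 = sqrt(n) * u, where u = (cutoff - mean) / s."""
    return sqrt(n) * (cutoff - mean) / s


def proportion_limits(n, delta_small, delta_large):
    """Convert the two values of delta into limits for U and then for P."""
    normal = NormalDist()
    u_small = delta_small / sqrt(n)
    u_large = delta_large / sqrt(n)
    # P is the proportion exceeding the cutoff, so a larger U means a smaller P
    p_small = 1.0 - normal.cdf(u_large)
    p_large = 1.0 - normal.cdf(u_small)
    return (u_small, u_large), (p_small, p_large)


if __name__ == "__main__":
    print(round(entry_value(10, 4.7, 0.2, 5.0), 3))
    u_limits, p_limits = proportion_limits(10, 2.723, 6.538)
    print([round(u, 3) for u in u_limits], [round(p, 3) for p in p_limits])
```

The limits for P follow from those for U because P is one minus the normal integral at U.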

Example 3. Suppose it is required to estimate fiducial limits for the value L 
of x in the normal population which is such that the proportion of the population 
exceeding L is 10 %. Consider the case where the sample size is n = 10 and the 
limits required are the 5 % limits at each end. From (18), we require 

$$\bar{x} + k_{0.05}\,s \quad \text{and} \quad \bar{x} + k_{0.95}\,s,$$

where 

$$k_\epsilon = \frac{t(9, \sqrt{10}\,U, \epsilon)}{\sqrt{10}} \qquad (40)$$

and U is the normal deviate exceeded with probability 0.1; i.e. U = 1.2816. We 
therefore require t(9, 4.053, 0.05) and t(9, 4.053, 0.95). In the first place these will 
be derived from Table IV by the inverse method described above. 

* For a note on the accuracy of linear interpolation see the end of Section 9. 

Following out the successive steps to derive t(9, 4.053, 0.05) we obtain 

(i) $t_1 = \dfrac{4.053 + (1.6449)(1 + 0.9126 - 0.1503)^{1/2}}{(1 - 0.1503)} = 7.340$, 

(ii) $y_1 = (1 + t_1^2/2f)^{-1/2} = 0.5004$; $\lambda_1 = \lambda(9, 7.340, 0.05) = 1.6803$, 

(iii) $t_2 = \dfrac{4.053 + (1.6803)(1 + 0.9126 - 0.1569)^{1/2}}{(1 - 0.1569)} = 7.448$, 

(iv) $y_2 = (1 + t_2^2/2f)^{-1/2} = 0.4950$; $\lambda_2 = \lambda(9, 7.448, 0.05) = 1.6800$, 

$t_3 = 7.447$. 

To obtain t(9, 4.053, 0.95) we require first t(9, -4.053, 0.05). The successive 
approximations to this quantity are 

(i) $t_1 = \dfrac{-4.053 + (1.6449)(1 + 0.9126 - 0.1503)^{1/2}}{(1 - 0.1503)} = -2.200$, 

(ii) $y'_1 = (t_1/\sqrt{2f})(1 + t_1^2/2f)^{-1/2}$; $\lambda_1 = \lambda(9, -2.200, 0.05)$, 

(iii) $t_2 = -2.249$, 

(iv) $y'_2 = (t_2/\sqrt{2f})(1 + t_2^2/2f)^{-1/2}$; $\lambda_2 = \lambda(9, -2.249, 0.05) = 1.5925$, 

$t_3 = -2.250$. 

t(9, 4.053, 0.95) is equal to minus t(9, -4.053, 0.05) and therefore equals 2.250. 

Hence from (40) 

$$k_{0.05} = \frac{7.447}{\sqrt{10}}, \qquad k_{0.95} = \frac{2.250}{\sqrt{10}}.$$

The above calculations have been performed using Table IV and the inverse 
method described in the previous section. Actually this example could have been 
treated much more simply by Table V. We shall proceed to show this, but it 
must be remembered that Table V only covers cases where $\epsilon$ = 0.05 and $\epsilon$ = 0.95. 
For other probability levels Table IV will have to be used. 

To obtain t(9, 4.053, 0.05) and t(9, -4.053, 0.05) Table V has to be entered with 
$(\delta/\sqrt{2f})(1 + \delta^2/2f)^{-1/2}$, where $\delta$ has the two values $\pm 4.053$, i.e. has to be entered with 
$\pm 0.6908$. These give values of $\lambda$ equal to 1.6800 and 1.5925 and are the same 
as the final $\lambda$'s obtained in the iterations. We have then only to perform the 
calculations for t by substituting in (37). 
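The Table V route for Example 3 may be sketched as follows. The entering argument is assumed to be $(\delta/\sqrt{2f})(1 + \delta^2/2f)^{-1/2}$, and the two multiples are taken from the text rather than computed: 1.6800 for $\delta$ = 4.053, and 1.5925 for $\delta$ = -4.053, the latter being the value that reproduces t = -2.250 on substitution in (37). Function names are hypothetical.

```python
from math import sqrt


def table_v_argument(f, delta):
    """Argument with which Table V is entered for given f and delta."""
    scaled = delta / sqrt(2.0 * f)
    return scaled / sqrt(1.0 + scaled * scaled)


def t_from_lambda(f, delta, lam):
    """Equation (37): t corresponding to delta for a given multiple lambda."""
    shrink = 1.0 - lam * lam / (2.0 * f)
    root = sqrt(1.0 + delta * delta / (2.0 * f) - lam * lam / (2.0 * f))
    return (delta + lam * root) / shrink


if __name__ == "__main__":
    f, delta, n = 9, 4.053, 10
    print(round(table_v_argument(f, delta), 4))
    t_upper = t_from_lambda(f, delta, 1.6800)    # t(9, 4.053, 0.05)
    t_lower = -t_from_lambda(f, -delta, 1.5925)  # t(9, 4.053, 0.95)
    # Multipliers of s in the limits for L, as in (40)
    print(round(t_upper / sqrt(n), 3), round(t_lower / sqrt(n), 3))
```

The argument is an odd function of $\delta$, so the table is entered at plus and minus the same value for the two limits.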
9. Summary 

An account of some of the applications of the non-central t-distribution* has 
been given. Tables of the distribution have been provided together with numerical 
examples of their use. 

If $t_0$ and $\delta$ are connected by the equation $P(t > t_0 \mid \delta) = \epsilon$, then it was found that 
a good approximation to $\delta$, given $t_0$, is 

$$\delta = t_0 - K_\epsilon\left(1 + \frac{t_0^2}{2f}\right)^{1/2},$$

where $K_\epsilon$ is the normal deviate exceeded with probability $\epsilon$. The equivalent 
approximation 

$$t_0 = \frac{\delta + K_\epsilon\left(1 + \dfrac{\delta^2}{2f} - \dfrac{K_\epsilon^2}{2f}\right)^{1/2}}{1 - \dfrac{K_\epsilon^2}{2f}}$$

is good for calculating $t_0$ given $\delta$. These approximations were seen to have 
advantages over other approximations also based only on the normal distribution. 

Tables IV and V at the end of the paper make possible a more exact determination 
of $\delta$ given $t_0$ and $t_0$ given $\delta$, respectively. In these tables a quantity $\lambda$ is given 
such that 

$$\delta = t_0 - \lambda\left(1 + \frac{t_0^2}{2f}\right)^{1/2}$$

and so 

$$t_0 = \frac{\delta + \lambda\left(1 + \dfrac{\delta^2}{2f} - \dfrac{\lambda^2}{2f}\right)^{1/2}}{1 - \dfrac{\lambda^2}{2f}}.$$

Note on the accuracy of the tables. In the greater part of the tables $\lambda$ is 
correct to as many figures as are shown. Occasionally, however, the values of 
$\lambda$ may be almost 2 units wrong in the last figure. It was nevertheless 
considered worth while to give all the figures shown. 

Throughout Table IV linear interpolation will always give a result not more 
than ½ unit wrong in the second last figure. This is also true of Table V, 
except between y' = -0.8 and -1.0 for f = 9 and 16. Here linear interpolation 
with respect to y' may be 1 unit wrong in the second last figure. 

* $t = (z + \delta)/\sqrt{w}$, where z is a unit normal deviate, w is distributed independently as $\chi^2/f$, and f 
is the number of degrees of freedom of $\chi^2$. 
Values of $\lambda(f, t_0, \epsilon)$ 

For note on accuracy of table see end of Section 9. 


APPENDIX 

On the calculation of Table IV 

Before deciding on the method to be employed in calculating the tables a wide variety of 
methods were considered. We examined these with a view to finding one which would 
best combine the attributes of speediness, lack of opportunity for numerical errors, and 
adaptability to moderately large-scale work. In this appendix will be described the more 
promising of the methods tested, including the one finally adopted. 

The non-central t being defined as in (1) we have for the probability that $t > t_0$, with the 
notation of Section 2, 

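$$P(f, \delta, t_0) = \int_0^\infty p(\chi)\left\{\frac{1}{\sqrt{2\pi}}\int_{t_0\chi/\sqrt{f}\,-\,\delta}^{\infty} e^{-\frac{1}{2}z^2}\,dz\right\}d\chi, \qquad (41)$$

where $p(\chi)$ denotes the frequency function of $\chi$ with f degrees of freedom. 

We want to find pairs of values $\delta$ and $t_0$ such that $P(f, \delta, t_0)$ has specified values $\epsilon$; i.e. we 
want to evaluate the quantities $t(f, \delta, \epsilon)$ and $\delta(f, t_0, \epsilon)$. 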





N. L. Johnson and B. L. Welch 




(i) Direct Methods 

These are based on the idea of evaluating the right-hand side of (41) directly for various 
values of $\delta$, f, and $t_0$ and then obtaining the quantities required by a process of inverse 
interpolation. An obvious way to evaluate $P(f, \delta, t_0)$ is by quadrature. For small even values 
of f, however, the following formulae: 

$$P(0 < t < \infty \mid f, \delta) = \frac{1}{\sqrt{2\pi}}\int_0^\infty e^{-\frac{1}{2}(u-\delta)^2}\,du,$$

l(/-2) 




1 

where $Hh_n(x) = \dfrac{1}{n!}\displaystyle\int_0^\infty u^n e^{-\frac{1}{2}(u+x)^2}\,du$, 

were found to be more convenient to use. The $Hh_n$ functions have been tabled by J. R. Airey 
(1931). 

This modified procedure was still unduly lengthy and not well adapted to large scale 
work. It provides, however, a useful check on values obtained by other methods. 
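The quadrature mentioned above is today a light computation and gives an independent check on tabled values. The sketch below is not the authors' procedure. It assumes $t = (z + \delta)/\sqrt{w}$ with $w = \chi^2/f$, so that $P(t > t_0)$ is the mean over $\chi$ of the normal integral at $\delta - t_0\chi/\sqrt{f}$. It uses Simpson's rule with an arbitrarily chosen step count and upper limit.

```python
import math
from statistics import NormalDist


def prob_t_exceeds(f, delta, t0, steps=4000):
    """P(t > t0) for non-central t with f >= 2, by Simpson's rule over chi."""
    if f < 2:
        raise ValueError("this sketch assumes f >= 2")
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of intervals
    upper = math.sqrt(f) + 12.0  # chi has negligible mass beyond this point
    h = upper / steps
    # log of the constant in the chi density: 2**(1 - f/2) / Gamma(f/2)
    log_norm = (1.0 - f / 2.0) * math.log(2.0) - math.lgamma(f / 2.0)
    normal_cdf = NormalDist().cdf
    root_f = math.sqrt(f)

    def integrand(c):
        if c == 0.0:
            return 0.0  # the chi density vanishes at the origin when f >= 2
        log_density = log_norm + (f - 1.0) * math.log(c) - 0.5 * c * c
        return math.exp(log_density) * normal_cdf(delta - t0 * c / root_f)

    total = integrand(0.0) + integrand(upper)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return total * h / 3.0
```

With $\delta$ = 0 the function reduces to the upper tail of Student's t, which gives a convenient test: for f = 9 the 5 % point 1.8331 should return a probability close to 0.05. The values obtained in Section 8, such as t(9, 4.053, 0.05) = 7.447, can be checked in the same way.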


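(ii) Solution by means of a differential equation 
For fixed values of f and $\epsilon$ the equation $P(f, \delta, t_0) = \epsilon$ may be regarded as specifying $\delta$ and 
$t_0$ as implicit functions of each other. This implies that for constant $\epsilon$, 

$$\frac{dt_0}{d\delta} = -\frac{\partial P(f, \delta, t_0)}{\partial \delta}\bigg/\frac{\partial P(f, \delta, t_0)}{\partial t_0}.$$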
From (41) this is found to give 

§=(/+*?)*//( -<^to[/+«SH. 

where = 

It may be shown that ^ ~ /'^(*) ~ ~ ^ • 


,(42) 


^ (43) 


For given values of f and $\epsilon$, the value of $t_0$ corresponding to $\delta$ = 0 can be found in a 
straightforward manner since it is simply a probability level of Student's t. Starting from this 
initial value we can solve the differential equation (42) numerically and obtain the values 
of $t_0$ corresponding to non-zero values of $\delta$. A trial of this method was made, taking f = 6, 
and $\epsilon$ = 0.05. The differential equation was solved by the Adams-Bashforth process. The 
values of higher derivatives of $t_0$ with respect to $\delta$ at the point $\delta$ = 0 are required in this 
process. By means of (43) the formulae for these quantities were obtained in quite a simple 
form. 

This method gave promising results: it was suitable for large-scale work and there was 
comparatively little chance of inaccuracies entering the work. It suffered, however, from the 
defect that we could proceed to build up our table by small increments only of $\delta$ (actually 
increments of 0.05). As we wanted to be able to deal with large as well as small values of $\delta$ 
we decided that the process was not sufficiently speedy for our purposes. 

It may be remarked, however, that a similar type of approach might be quite suitable in 
cases where the expression for the differential coefficient is simpler than was ours. 
(iii) Solution based on normal approximation 

The method finally adopted for calculating the greater part of the table is based on the 
use of Edgeworth's development of the normal probability function. In a recent paper 
E. A. Cornish & R. A. Fisher (1937) have given formulae, derived from this development, 
which make it possible to find percentage levels of a probability function when the cumulants 
of the function are given. These formulae are of most practical use when the probability 
function is itself nearly normal in form. With markedly non-normal distributions, the 
development may not converge, or at least more terms may be required than is practically 
convenient. 

The cumulants of the non-central t-distribution are not difficult to find. The first three are 
given in equations (19), (20) and (21) and the values of $\sqrt{\beta_1}$ for different f and $\delta$ are given in 
Table I. It was pointed out in the discussion of Section 6 that the non-central t-distribution 
would often be markedly skew. It would appear, therefore, that it may not be a suitable 
distribution to express by a development of the normal function. Fortunately it is possible, 
as is pointed out in equation (26), to express problems demanding the quantities $t(f, \delta, \epsilon)$ and 
$\delta(f, t_0, \epsilon)$ in a form such that the solution can be based on a distribution much more nearly 
normal than that of non-central t. 

It was shown that the statement 

$$P(t > t_0 \mid f, \delta) = \epsilon$$

was equivalent to the statement 

$$P(Y < \delta \mid f, t_0) = \epsilon,$$

where $Y = (-z) + t_0\chi/\sqrt{f}$. 

(Here z is a unit normal deviate and $\chi$ is distributed in the $\chi$-distribution with f degrees of 
freedom.) The problem of finding $\delta(f, t_0, \epsilon)$ is therefore equivalent to finding a percentage 
level of Y. 

The cumulants of Y may be obtained immediately from the formulae 

$$\kappa_r(Y) = \kappa_r(-z) + t_0^r f^{-r/2}\kappa_r(\chi). \qquad (44)$$

In particular for r = 2, $\kappa_2(Y) = 1 + t_0^2 f^{-1}\kappa_2(\chi)$, 

while for $r \neq 2$, $\kappa_r(Y) = t_0^r f^{-r/2}\kappa_r(\chi)$. 

Hence for all r the shape coefficients $\gamma_r(Y)$ increase from 0 to $\gamma_r(\chi)$ as $t_0$ increases from 0 to $\infty$. 
The $\gamma$'s of Y are therefore always smaller than those of $\chi$. But these latter are quite small even 
for small f. Hence the distribution of Y can never be far removed from normality. The 
distribution of Y is therefore suitable for development by Edgeworth's method and Cornish & 
Fisher's formulae may conveniently be used to provide percentage levels. 

In this approach the problem of finding $\delta(f, t_0, \epsilon)$ appears as more fundamental than that 
of finding $t(f, \delta, \epsilon)$. Hence Table IV is designed to facilitate the direct calculation of $\delta(f, t_0, \epsilon)$. 
Its use to provide $t(f, \delta, \epsilon)$ demands a process of iteration which is not however difficult. The 
actual quantity $\lambda(f, t_0, \epsilon)$ which is tabled differs by a small amount from $K_\epsilon$, the normal 
deviate which is exceeded with probability $\epsilon$. This difference is given by the Cornish-Fisher 
formula. All the terms in this formula were used. This necessitated the use of the first six 
cumulants of $\chi$. A method of obtaining these has already been described in a previous note 
(N. L. Johnson & B. L. Welch, 1939). 

The greater part of the computation of Tables IV and V has been the work of one of us 
(N. L. Johnson). The remainder and much subsidiary checking work is due to Miss Catherine 
M. Thompson. We gratefully acknowledge her assistance. 
MISCELLANEA 

(i) Stirling's formula with remainder* 

By E. V. HUNTINGTON, Harvard University 

We assume the following theorem as known: 

$$n! = \sqrt{2\pi}\; n^{n+\frac{1}{2}} e^{-n}\,[e^P],$$

where n = 2, 3, 4, ... and 

$$P = \frac{1}{12n} - \frac{1}{360n^3} + \frac{1}{1260n^5} - \frac{1}{1680n^7} + \frac{1}{1188n^9} - \frac{691}{360360n^{11}} + \ldots$$

    = 0.083,333n^-1 - 0.002,777n^-3 + 0.000,794n^-5 - 0.000,595n^-7 
      + 0.000,842n^-9 - 0.001,918n^-11 + .... 

We assume also that the error involved in stopping the series at any point is less (in 
absolute value) than the first term dropped. (For proof, see, for example, E. B. Wilson, 
Advanced Calculus, p. 466.) 

Putting $Q = e^P$, and expanding and collecting terms, we find: 

$$n! = \sqrt{2\pi}\; n^{n+\frac{1}{2}} e^{-n}\,[Q],$$

where 

$$Q = 1 + \frac{1}{12n} + \frac{1}{288n^2} - \frac{139}{51{,}840n^3} - \frac{571}{2{,}488{,}320n^4} + \frac{163{,}879}{209{,}018{,}880n^5} + \frac{5{,}246{,}819}{75{,}246{,}796{,}800n^6}$$
$$\qquad - \frac{534{,}703{,}531}{902{,}961{,}561{,}600n^7} - \frac{4{,}483{,}131{,}259}{86{,}684{,}309{,}913{,}600n^8} + \frac{432{,}261{,}921{,}612{,}371}{514{,}904{,}800{,}886{,}784{,}000n^9} + \ldots$$

    = 1 + 0.083,333,333n^-1 + 0.003,472,222n^-2 
      - 0.002,681,327n^-3 - 0.000,229,472n^-4 
      + 0.000,784,039n^-5 + 0.000,069,728n^-6 - 0.000,592,166n^-7 
      - 0.000,051,718n^-8 + 0.000,839,499n^-9 + .... 

(For the terms up to and including $n^{-7}$, compare, for example, H. T. Davis, Tables, 1, 180.) 

Let $Q_1$ = the sum of the Q-series up to and including the term in $n^{-1}$, $Q_2$ = the sum of the 
Q-series up to and including the term in $n^{-2}$, etc. 

Then we find: 

    Q = 1 + R1,   where  0.083,33n^-1 < R1 <  0.086,07n^-1. 
    Q = Q1 + R2,  where  0.002,07n^-2 < R2 <  0.003,48n^-2. 
    Q = Q2 + R3,  where -0.002,80n^-3 < R3 < -0.002,59n^-3. 
    Q = Q3 + R4,  where -0.000,23n^-4 < R4 <  0.000,18n^-4. 
    Q = Q4 + R5,  where  0.000,66n^-5 < R5 <  0.000,82n^-5. 
    Q = Q5 + R6,  where -0.000,24n^-6 < R6 <  0.000,07n^-6. 
    Q = Q6 + R7,  where -0.000,62n^-7 < R7 < -0.000,39n^-7. 
    Q = Q7 + R8,  where -0.000,06n^-8 < R8 <  0.000,39n^-8. 
    Q = Q8 + R9,  where -0.000,01n^-9 < R9 <  0.000,88n^-9. 

* Presented to the American Mathematical Society, 8 April 1939. 
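The remainder bounds can be examined numerically for moderate n. The following Python sketch is an illustration and not part of the note: it takes the first six coefficients of the Q-series as exact fractions, forms Q from the factorial directly in floating point, and returns the scaled remainder $R_{k+1}n^{k+1}$. The names used are hypothetical.

```python
import math
from fractions import Fraction

# Coefficients of n**0, n**-1, ..., n**-5 in the Q-series
Q_COEFFS = [
    Fraction(1),
    Fraction(1, 12),
    Fraction(1, 288),
    Fraction(-139, 51840),
    Fraction(-571, 2488320),
    Fraction(163879, 209018880),
]


def stirling_q(n):
    """Q = n! / (sqrt(2*pi) * n**(n + 1/2) * exp(-n)), evaluated in floats."""
    return math.factorial(n) / (
        math.sqrt(2.0 * math.pi) * n ** (n + 0.5) * math.exp(-n)
    )


def q_partial(n, k):
    """Q_k: the Q-series up to and including the term in n**-k (Q_0 = 1)."""
    total = sum(c / Fraction(n) ** j for j, c in enumerate(Q_COEFFS[: k + 1]))
    return float(total)


def scaled_remainder(n, k):
    """R_{k+1} * n**(k+1), where Q = Q_k + R_{k+1}."""
    return (stirling_q(n) - q_partial(n, k)) * n ** (k + 1)


if __name__ == "__main__":
    for n in (2, 5, 10, 20):
        print(n, scaled_remainder(n, 0), scaled_remainder(n, 1))
```

For n from 2 to 30 the first two scaled remainders stay inside the stated limits for $R_1$ and $R_2$. Floating point restricts the check to moderate n and to the earlier remainders; exact rational arithmetic would be needed for the later ones.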
(ii) A note on the interpretation of quasi-sufficiency 
By M. S. BARTLETT 

1. It has been shown by Fisher that if in problems of location we confine our attention 
to samples with the same configuration C, where C denotes the differences between the 
observations, then no statistical information in his sense is lost whatever reasonable statistic 
T we choose as an estimate of the unknown parameter $\theta$. This result follows for probability 
laws of the type under consideration, 

$$p(x \mid \theta) = f(x - \theta)\,dx,$$

from the relation 

$$p(S \mid \theta) = p(T \mid C, \theta)\,p(C), \qquad (1)$$

where S denotes the sample of observations. For the conditional statistic T | C (a random 
or statistical variable T varying for given or fixed C) I have when a relation like (1) exists 
used the term quasi-sufficient, in view of the correspondence of (1) with the relation for a 
sufficient statistic T, 

$$p(S \mid \theta) = p(T \mid \theta)\,p(S \mid T). \qquad (2)$$

Conditional statistics also occur in the theory of more than one unknown, when the fixing 
of some statistical variables has the effect of eliminating unwanted parameters from our 
distributions. Confining our attention, however, to problems which are primarily problems 
in one unknown only, we require to examine relations of the type (1) further, in view of some 
recent comments by Welch* on the extent to which any conditional statistic like T | C can 
claim to be sufficient. 

From Fisher's theory of information, it follows that when estimating $\theta$ from two or more 
samples, we should naturally weight the different samples according to the information 
I(C) in each. But Welch has pointed out that if we are concerned with interval estimation 
of $\theta$ from a single sample, or with the most efficient statistical test associated with a value $\theta_0$, 
it does not appear that all our information, in the widest sense, is retained in T | C. If, for 
example, the only alternative to $\theta_0$ happens to be $\theta_1$, it is known that the appropriate criterion 
for discriminating $\theta_0$ and $\theta_1$ is $p(S \mid \theta_1)/p(S \mid \theta_0)$. While from equation (1) it follows that this 
criterion is equivalent to $p(T \mid C, \theta_1)/p(T \mid C, \theta_0)$, the former criterion must be referred to the 
distribution of S, not of S | C. 

In partial answer to this criticism, it might be noticed that when considering S | C, we are 
not bound to choose the same significance level $\alpha$ for each C observed. Suppose we choose 
$\alpha(C_1) = \epsilon$ for the first sample. If the second sample gave a different configuration $C_2$, we 
might take $\alpha(C_2)$ such that the power of the test was a maximum for the two configurations 
$C_1$ and $C_2$, as if we had chosen $\alpha(C_1)$ such that the average of $\alpha(C_1)$ and $\alpha(C_2)$ was equal to $\epsilon$. 
Similarly for $C_3$ and so on. 

In the long run, the significance level adopted on this rule would be the average value 
$E_r\{\alpha_r(C_r)\}$, where $\alpha_r(C_r)$ denotes the significance level taken for the rth sample when making 
the rth test. For any finite number of samples the average significance level adopted is not 
exactly equal to $\epsilon$, but the level becomes equal to $\epsilon$ in the limit; thus we could argue that, 
theoretically at least, we should eventually reach the most powerful test for a given 
significance level merely from consideration of S | C. In the above argument we allow the 
configuration distribution to be generated by the samples themselves. It is of course theoretical 
quibbling, for if we may assume that C has a definite distribution p(C) which is already 
known, we should use this fact rather than wait for its confirmation by repeated sampling. 

My own answer to the query of how far T | C can be considered sufficient is that it is only 
sufficient provided that we do really agree to consider samples with the same configuration 
as that observed. The rejection altogether of the factor p(C) has two consequences, (i) we 
have seen that a test might conceivably be derived on the basis of p(C) more powerful than 
the test based on S | C with fixed significance level $\alpha(C) = \epsilon$, (ii) the fixing of C implies that 
whatever distorting selection for C there might be, the validity of tests based on S | C 
will be unaffected. 

* B. L. Welch. Annals Math. Statist. 10 (1939), 58-69. 

The independence of the test from selection for C is the explicit advantage gained, 
though in addition it is possible that if our specification of the population is erroneous, 
it is less so for p(T | C) than p(S); for example, if the configuration C is similar to the expected 
configuration on normal theory, it may prove more feasible to assume for a certain range of 
populations differing from normality that normal theory may still be approximately 
substituted for the unknown probability p(T | C) than that we may do so for p(S) or p(T), 
though I have not investigated this point. 

2. It is instructive to remember Fisher's comparison of the quasi-sufficient statistic 
T | C with the sufficient statistic T, where T may be written in full, T | n, where n is the size 
of the sample. The equivalence of the two types of statistic when we regard C as fixed can 
be extended, in terms of our theoretical argument, if conversely we assume that n had in 
some problem a definite and known distribution from sample to sample. Consider similarly 
the test of significance of a regression coefficient. The orthodox theory is to consider the 
conditional statistic $b \mid S(x - \bar{x})^2$, where b is our estimate, and $S(x - \bar{x})^2$ the sum of squares of 
deviations of the independent variable x. Suppose for the sake of argument that the true 
variance of the residual dependent variable was known to be unity, and the x's are such 
that $S(x - \bar{x})^2$ = 1 on Mondays and 1.44 on Tuesdays. Then for an 0.025 significance level 
(one tail), the usual practice would be to take 1.96 as the significance level for b (from $b_0$ = 0) 
on Mondays, and 1.96/1.2 = 1.633 on Tuesdays. The power of the test in relation to the 
alternative that $b_1$ = 3.92 is 0.9860. But if we were satisfied with adjusting the significance 
level to be 0.025 merely in the long run for Mondays and Tuesdays together, we may raise 
the power of the test to its maximum value of 0.9878, by taking the Monday significance level 
at b = 1.87 ($\alpha$ = 0.0307) and the Tuesday level at b = 1.723 ($\alpha$ = 0.0194). 
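The figures in the regression illustration can be reproduced with a few lines of Python. The sketch assumes, as the text implies, that b is normally distributed with variance $1/S(x - \bar{x})^2$ and that Mondays and Tuesdays occur equally often. It evaluates the levels and powers quoted above and makes no attempt to locate the optimum itself. The function names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

NORMAL = NormalDist()


def level_and_power(critical_b, sum_of_squares, alternative=3.92):
    """One-tail level under b = 0 and power under b = alternative."""
    sd = 1.0 / sqrt(sum_of_squares)  # residual variance is taken as unity
    level = 1.0 - NORMAL.cdf(critical_b / sd)
    power = 1.0 - NORMAL.cdf((critical_b - alternative) / sd)
    return level, power


def long_run(rule):
    """Average level and power over equally frequent days.

    rule maps each day's sum of squares to the critical value of b used.
    """
    results = [level_and_power(b, ss) for ss, b in rule.items()]
    mean_level = sum(level for level, _ in results) / len(results)
    mean_power = sum(power for _, power in results) / len(results)
    return mean_level, mean_power


if __name__ == "__main__":
    usual = {1.0: 1.96, 1.44: 1.96 / 1.2}
    adjusted = {1.0: 1.87, 1.44: 1.723}
    print(long_run(usual))     # level near 0.025, power near 0.9860
    print(long_run(adjusted))  # level near 0.025, power near 0.9878
```

Both rules have a long-run level of about 0.025; the adjusted rule spends more of the level on Mondays, where the test is less sensitive, and this is the source of the gain in power.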


SUMMARY 

In a theoretical note on the interpretation of quasi-sufficient statistics in problems of one 
unknown parameter, it is agreed with Welch that full sufficiency properties can only be 
claimed if we deliberately confine our attention to conditional samples of the type observed; 
but it is pointed out that the resulting inferences are independent of any selection operating 
on the variables that have been fixed. 


(iii) The cumulants and moments of the binomial distribution, and the
cumulants of $\chi^2$ for a ($n \times 2$)-fold table

By J. B. S. HALDANE, F.R.S.

From the Department of Biometry, University College, London

The first four cumulants of the distribution of $\chi^2$ for a ($n \times 2$)-fold table when samples are
finite, have already been given (Haldane, 1937). These and higher cumulants and moments
can be calculated by a simpler method. Consider a sample of $s$, the probability of a success
being $p$, and $p + q = 1$. Pearson (1919) pointed out that for moments of $(p + q)^s$ about its mean,
the generating function is $(qe^{pt} + pe^{-qt})^s$, and Romanovsky (1923) gave a recurrence formula
for the moments. That for the cumulants is much simpler.

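Let $U = qe^{pt} + pe^{-qt}$.

Then the cumulant-generating function

$$K(t) = \sum_{r=2}^{\infty} \kappa_r \frac{t^r}{r!} = s \log U.$$

To find the cumulants for $s = 1$, we note that (treating $p$ as $1 - q$)

$$\frac{\partial K}{\partial q} = \frac{e^{pt} - e^{-qt}}{U} - t, \qquad \frac{\partial K}{\partial t} = \frac{pq(e^{pt} - e^{-qt})}{U}.$$

So

$$\frac{\partial K}{\partial t} = pq\left(\frac{\partial K}{\partial q} + t\right).$$

Equating the coefficients of $t^r/r!$, we find

$$\kappa_2 = pq,$$

and if $r \geq 2$

$$\kappa_{r+1} = pq\,\frac{d\kappa_r}{dq}. \tag{1}$$

Let $pq = c$, $p - q = g$; then

$$\frac{d\kappa_r}{dq} = \frac{\partial \kappa_r}{\partial c}\frac{dc}{dq} + \frac{\partial \kappa_r}{\partial g}\frac{dg}{dq} = g\frac{\partial \kappa_r}{\partial c} - 2\frac{\partial \kappa_r}{\partial g}.$$

Hence if $\kappa_{2r} = f(c)$,

$$\kappa_{2r+1} = cg f'(c), \qquad \kappa_{2r+2} = c(1 - 6c) f'(c) + c^2(1 - 4c) f''(c). \tag{2}$$

From these equations we can very rapidly calculate successive values of $\kappa_r$, since $\kappa_2 = c$,
and find: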

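$$\begin{aligned}
\kappa_1 &= 0,\\
\kappa_2 &= c,\\
\kappa_3 &= cg,\\
\kappa_4 &= c - 6c^2,\\
\kappa_5 &= g(c - 12c^2),\\
\kappa_6 &= c - 30c^2 + 120c^3,\\
\kappa_7 &= g(c - 60c^2 + 360c^3),\\
\kappa_8 &= c - 126c^2 + 1{,}680c^3 - 5{,}040c^4,\\
\kappa_9 &= g(c - 252c^2 + 5{,}040c^3 - 20{,}160c^4),\\
\kappa_{10} &= c - 510c^2 + 17{,}640c^3 - 151{,}200c^4 + 362{,}880c^5,\\
\kappa_{11} &= g(c - 1{,}020c^2 + 52{,}920c^3 - 604{,}800c^4 + 1{,}814{,}400c^5),\\
\kappa_{12} &= c - 2{,}046c^2 + 168{,}960c^3 - 3{,}160{,}080c^4 + 19{,}958{,}400c^5 - 39{,}916{,}800c^6.
\end{aligned} \tag{3}$$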
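
The recurrence (2) is easy to mechanise. The following sketch is not part of the original paper: it stores each even cumulant as a list of integer coefficients in powers of $c$ and applies $\kappa_{2r+2} = c(1-6c)f'(c) + c^2(1-4c)f''(c)$ repeatedly, starting from $\kappa_2 = c$. The function names `next_even`, `even_cumulants` and `odd_factor` are hypothetical.

```python
def next_even(f):
    """Given the coefficients of kappa_{2r} = f(c) (f[i] multiplies c**i),
    return those of kappa_{2r+2} = c(1-6c) f'(c) + c^2 (1-4c) f''(c)."""
    d1 = [i * f[i] for i in range(1, len(f))]      # coefficients of f'(c)
    d2 = [i * d1[i] for i in range(1, len(d1))]    # coefficients of f''(c)
    out = [0] * (len(f) + 1)
    for i, a in enumerate(d1):                     # c(1 - 6c) f'(c)
        out[i + 1] += a
        out[i + 2] -= 6 * a
    for i, a in enumerate(d2):                     # c^2 (1 - 4c) f''(c)
        out[i + 2] += a
        out[i + 3] -= 4 * a
    return out


def even_cumulants(max_order):
    """Coefficient lists of kappa_2, kappa_4, ..., kappa_max_order for s = 1."""
    f = [0, 1]                                     # kappa_2 = c
    table = {2: f}
    for order in range(4, max_order + 1, 2):
        f = next_even(f)
        table[order] = f
    return table


def odd_factor(f):
    """kappa_{2r+1} = g * c f'(c): return the coefficients of c f'(c)."""
    return [i * a for i, a in enumerate(f)]


table = even_cumulants(12)
for order, coeffs in table.items():
    print(order, coeffs)
```

Working with integer coefficient lists avoids any symbolic-algebra dependency, and the same loop can be continued past $\kappa_{12}$ without modification.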

If each of the above cumulants be multiplied by $s$, the moments about the mean can now
be calculated from the expressions given by Fisher (1928) and Haldane (1938). If $p = \frac{1}{2}$ we
have $K(t) = s \log \cosh \frac{1}{2}t$,

so
$$\kappa_2 = \frac{s}{4}, \quad \kappa_4 = -\frac{s}{8}, \quad \kappa_6 = \frac{s}{4}, \quad \kappa_8 = -\frac{17s}{16}, \quad \kappa_{10} = \frac{31s}{4}, \quad \kappa_{12} = -\frac{691s}{8};$$

while if $q$ is very small we have for the cumulant-generating function of a Poisson series

$$K(t) = sc(e^t - 1 - t).$$

The coefficient of $c^2$ in $\kappa_r$ is $-[2^{r-1} + (-1)^r - 3]$. So when $q$ is small, but its square is not neglected,
the first order correction to the Poisson cumulant-generating function is

$K(t) = \ldots$

The numerical coefficient of the highest power of $c$ in $\kappa_r$ is $(r - 1)!$ when $r$ is even, and
$\frac{1}{2}(r - 1)!$ when $r$ is odd.

Consider a sample of $s$, in which $a$ successes are recorded. Then

$$\chi^2 = \frac{(a - sp)^2}{spq}.$$

But $a - sp$ is the departure from the mean of the binomial distribution $(p + q)^s$. Hence the
$r$th moment of the distribution of $\chi^2$ (for one degree of freedom) about zero, is

$$(spq)^{-r}\mu_{2r},$$

where $\mu_{2r}$ is the $2r$th moment of $(p + q)^s$.

But if $\mu'_r$ and $\kappa'_r$ be the $r$th moment about the mean, and the $r$th cumulant, of the $\chi^2$
distribution, then

Pt ^ oto„ < = p!i, oto. 

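Making the necessary substitutions, we find, for the cumulants of $\chi^2$ in terms of those of
the binomial distribution:

$$\begin{aligned}
\kappa'_1 ={}& (sc)^{-1}\kappa_2,\\
\kappa'_2 ={}& (sc)^{-2}(2\kappa_2^2 + \kappa_4),\\
\kappa'_3 ={}& (sc)^{-3}[8\kappa_2^3 + 2(6\kappa_2\kappa_4 + 5\kappa_3^2) + \kappa_6],\\
\kappa'_4 ={}& (sc)^{-4}[48\kappa_2^4 + 48(5\kappa_2\kappa_3^2 + 3\kappa_2^2\kappa_4) + 8(4\kappa_4^2 + 7\kappa_3\kappa_5 + 3\kappa_2\kappa_6) + \kappa_8],\\
\kappa'_5 ={}& (sc)^{-5}[384\kappa_2^5 + 960(5\kappa_2^2\kappa_3^2 + 2\kappa_2^3\kappa_4) + 80(25\kappa_3^2\kappa_4 + 16\kappa_2\kappa_4^2 + 28\kappa_2\kappa_3\kappa_5\\
& + 6\kappa_2^2\kappa_6) + 2(63\kappa_5^2 + 100\kappa_4\kappa_6 + 60\kappa_3\kappa_7 + 20\kappa_2\kappa_8) + \kappa_{10}],\\
\kappa'_6 ={}& (sc)^{-6}[3{,}840\kappa_2^6 + 9{,}600(10\kappa_2^3\kappa_3^2 + 3\kappa_2^4\kappa_4) + 4{,}800(3\kappa_3^4 + 25\kappa_2\kappa_3^2\kappa_4\\
& + 8\kappa_2^2\kappa_4^2 + 14\kappa_2^2\kappa_3\kappa_5 + 2\kappa_2^3\kappa_6) + 40(132\kappa_4^3 + 672\kappa_3\kappa_4\kappa_5 + 189\kappa_2\kappa_5^2\\
& + 226\kappa_3^2\kappa_6 + 300\kappa_2\kappa_4\kappa_6 + 180\kappa_2\kappa_3\kappa_7 + 30\kappa_2^2\kappa_8) + 4(113\kappa_6^2 + 198\kappa_5\kappa_7\\
& + 120\kappa_4\kappa_8 + 55\kappa_3\kappa_9 + 15\kappa_2\kappa_{10}) + \kappa_{12}].
\end{aligned} \tag{4}$$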

We now substitute the values of $\kappa_r$ given in equations (3) multiplied by $s$, putting

$$k = (pq)^{-1} = c^{-1}.$$

We therefore have, for the cumulants of $\chi^2$ with one degree of freedom:

$$\begin{aligned}
\kappa_1 ={}& 1,\\
\kappa_2 ={}& 2 + (k - 6)s^{-1},\\
\kappa_3 ={}& 8 + 2(11k - 56)s^{-1} + (k^2 - 30k + 120)s^{-2},\\
\kappa_4 ={}& 48 + 96(4k - 19)s^{-1} + 16(7k^2 - 125k + 420)s^{-2} + (k^3 - 126k^2 + 1{,}680k - 5{,}040)s^{-3},\\
\kappa_5 ={}& 384 + 960(7k - 32)s^{-1} + 400(15k^2 - 214k + 648)s^{-2} + 6(81k^3\\
& - 3{,}908k^2 + 38{,}240k - 98{,}496)s^{-3} + (k^4 - 510k^3 + 17{,}640k^2\\
& - 151{,}200k + 362{,}880)s^{-4},\\
\kappa_6 ={}& 3{,}840 + 9{,}600(13k - 58)s^{-1} + 9{,}600(26k^2 - 327k + 924)s^{-2}\\
& + 40(1{,}729k^3 - 56{,}236k^2 + 459{,}024k - 1{,}065{,}792)s^{-3} + 4(501k^4\\
& - 59{,}398k^3 + 1{,}289{,}244k^2 - 8{,}824{,}320k + 18{,}555{,}840)s^{-4}\\
& + (k^5 - 2{,}046k^4 + 168{,}960k^3 - 3{,}160{,}080k^2 + 19{,}958{,}400k\\
& - 39{,}916{,}800)s^{-5}.
\end{aligned} \tag{5}$$

When $p = \frac{1}{2}$, $k = 4$, and we have, for $n$ degrees of freedom:

$$\begin{aligned}
\kappa_1 &= n,\\
\kappa_2 &= 2ns^{-1}(s - 1),\\
\kappa_3 &= 8ns^{-2}(s - 1)(s - 2),\\
\kappa_4 &= 16ns^{-3}(s - 1)(3s^2 - 15s + 17),\\
\kappa_5 &= 128ns^{-4}(s - 1)(s - 2)(3s^2 - 21s + 31),\\
\kappa_6 &= 256ns^{-5}(s - 1)(15s^4 - 210s^3 + 990s^2 - 1{,}950s + 1{,}382).
\end{aligned} \tag{6}$$
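
Equations (5) can be checked against a direct enumeration of the binomial distribution. The sketch below is an illustration, not the author's method. It computes the exact cumulants of $\chi^2 = (a - sp)^2/(spq)$ with rational arithmetic, using the standard recursion between raw moments and cumulants, and compares them with the polynomials in $k$ and $s^{-1}$. The function names `chi2_cumulants` and `formula_5` are hypothetical.

```python
from fractions import Fraction
from math import comb


def chi2_cumulants(s, p, order=6):
    """Exact cumulants 1..order of chi^2 = (a - s p)^2 / (s p q), a ~ Binomial(s, p)."""
    p = Fraction(p)
    q = 1 - p
    probs = [comb(s, a) * p**a * q ** (s - a) for a in range(s + 1)]
    vals = [(a - s * p) ** 2 / (s * p * q) for a in range(s + 1)]
    # Raw moments m[r] = E[(chi^2)^r].
    m = [sum(w * v**r for w, v in zip(probs, vals)) for r in range(order + 1)]
    kappa = [Fraction(0)] * (order + 1)
    for n in range(1, order + 1):
        # Standard recursion: m_n = sum_j C(n-1, j-1) kappa_j m_{n-j}.
        kappa[n] = m[n] - sum(
            comb(n - 1, j - 1) * kappa[j] * m[n - j] for j in range(1, n)
        )
    return kappa[1:]


def formula_5(s, p):
    """Cumulants kappa_1..kappa_6 of chi^2 (one degree of freedom) from equations (5)."""
    p = Fraction(p)
    k = 1 / (p * (1 - p))
    u = Fraction(1, s)  # u stands for 1/s
    return [
        Fraction(1),
        2 + (k - 6) * u,
        8 + 2 * (11 * k - 56) * u + (k**2 - 30 * k + 120) * u**2,
        48 + 96 * (4 * k - 19) * u + 16 * (7 * k**2 - 125 * k + 420) * u**2
        + (k**3 - 126 * k**2 + 1680 * k - 5040) * u**3,
        384 + 960 * (7 * k - 32) * u + 400 * (15 * k**2 - 214 * k + 648) * u**2
        + 6 * (81 * k**3 - 3908 * k**2 + 38240 * k - 98496) * u**3
        + (k**4 - 510 * k**3 + 17640 * k**2 - 151200 * k + 362880) * u**4,
        3840 + 9600 * (13 * k - 58) * u + 9600 * (26 * k**2 - 327 * k + 924) * u**2
        + 40 * (1729 * k**3 - 56236 * k**2 + 459024 * k - 1065792) * u**3
        + 4 * (501 * k**4 - 59398 * k**3 + 1289244 * k**2 - 8824320 * k + 18555840) * u**4
        + (k**5 - 2046 * k**4 + 168960 * k**3 - 3160080 * k**2 + 19958400 * k
           - 39916800) * u**5,
    ]


print(chi2_cumulants(5, Fraction(1, 3)) == formula_5(5, Fraction(1, 3)))
```

Because `Fraction` arithmetic is exact, the comparison is an equality test rather than a tolerance test. Setting `p = Fraction(1, 2)` gives the $n = 1$ case of equations (6).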

If there are $n$ samples, with different values of $s$, we have, for the cumulants of $\chi^2$, where

$h = \frac{1}{2}k = (2pq)^{-1}$, and $R_t = S(s^{-t})$, $S$ denoting summation over the $n$ samples,

$$\begin{aligned}
\kappa_1 ={}& n,\\
\kappa_2 ={}& 2[n + (h - 3)R_1],\\
\kappa_3 ={}& 4[2n + (11h - 28)R_1 + (h^2 - 15h + 30)R_2],\\
\kappa_4 ={}& 8[6n + 12(8h - 19)R_1 + 4(14h^2 - 125h + 210)R_2 + (h^3 - 63h^2 + 420h - 630)R_3],\\
\kappa_5 ={}& 16[24n + 120(7h - 16)R_1 + 100(15h^2 - 107h + 162)R_2 + 3(81h^3\\
& - 1{,}954h^2 + 9{,}560h - 12{,}312)R_3 + (h^4 - 255h^3 + 4{,}410h^2 - 18{,}900h + 22{,}680)R_4],\\
\kappa_6 ={}& 32[120n + 600(13h - 29)R_1 + 600(52h^2 - 327h + 462)R_2 + 10(1{,}729h^3\\
& - 28{,}118h^2 + 114{,}756h - 133{,}224)R_3 + 2(501h^4 - 29{,}699h^3 + 322{,}311h^2\\
& - 1{,}103{,}040h + 1{,}159{,}740)R_4 + (h^5 - 1{,}023h^4 + 42{,}240h^3 - 395{,}010h^2\\
& + 1{,}247{,}400h - 1{,}247{,}400)R_5].
\end{aligned} \tag{7}$$

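When $p = q = \frac{1}{2}$, we have:

$$\begin{aligned}
\kappa_1 &= n,\\
\kappa_2 &= 2(n - R_1),\\
\kappa_3 &= 8(n - 3R_1 + 2R_2),\\
\kappa_4 &= 16(3n - 18R_1 + 32R_2 - 17R_3),\\
\kappa_5 &= 128(3n - 30R_1 + 100R_2 - 135R_3 + 62R_4),\\
\kappa_6 &= 256(15n - 225R_1 + 1{,}200R_2 - 2{,}940R_3 + 3{,}332R_4 - 1{,}382R_5).
\end{aligned} \tag{8}$$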

The first four of equations (5, 6, 7, 8) have already been given in a slightly different form
by Haldane (1937). The limiting forms of equations (5) and (7) when s tends to infinity and
k to zero, while ks = g, have been given by Haldane (1938). However, the expression for $\kappa_6$
there given is incorrect. The coefficient of $R_1$ in the expression for $\kappa_6$ should be 124,800.

The extension of equations (7) would be rather tedious. However, those of equations (6)
and (8) would not be very difficult. The coefficient of $x^{2r}/(2r)!$ in the expansion of $\log \cosh x$ is
the value of $(d/dx)^{2r-1} \tanh x$ when $x = 0$, and can easily be calculated, since this
differential coefficient is a polynomial in $\tanh x$. The equations for moments in terms of
cumulants can easily be extended when all odd cumulants vanish. In this case a useful check
can be obtained from the fact that when $s = 2$ the cumulant-generating function of $\chi^2$
is $t + \log \cosh t$.
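
The remark about $\tanh x$ can be turned into a short calculation. The sketch below is an illustration only, with the hypothetical function name `odd_tanh_derivatives_at_zero`. It represents each derivative of $\tanh x$ as a polynomial in $T = \tanh x$, using $dT/dx = 1 - T^2$, and reads off the value at $x = 0$ (that is, $T = 0$) for the odd orders. These values are the even cumulants of $\log \cosh x$.

```python
from fractions import Fraction


def odd_tanh_derivatives_at_zero(count):
    """Values at x = 0 of (d/dx)^(2r-1) tanh x for r = 1..count."""
    poly = [0, 1]                       # tanh x as a polynomial in T = tanh x
    values = []
    for order in range(1, 2 * count):
        deriv = [i * poly[i] for i in range(1, len(poly))]   # dP/dT
        poly = [0] * (len(deriv) + 2)
        for i, a in enumerate(deriv):   # multiply dP/dT by dT/dx = 1 - T^2
            poly[i] += a
            poly[i + 2] -= a
        if order % 2 == 1:
            values.append(poly[0])      # constant term is the value at T = 0
    return values


def binomial_half_cumulants(count):
    """kappa_2, kappa_4, ... per unit s when p = 1/2, from K(t) = s log cosh(t/2)."""
    return [
        Fraction(v, 4 ** (r + 1))
        for r, v in enumerate(odd_tanh_derivatives_at_zero(count))
    ]


print(odd_tanh_derivatives_at_zero(6))
print(binomial_half_cumulants(6))
```

The first list gives the even cumulants of $\log \cosh t$, which for $s = 2$ should agree with $\kappa_2$, $\kappa_4$ and $\kappa_6$ of equations (6) with $n = 1$. The second list reproduces the values of $\kappa_{2r}/s$ for the binomial distribution with $p = \frac{1}{2}$.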

SUMMARY

Expressions are obtained for the first twelve cumulants of the binomial distribution, and
a simple recurrence formula for further cumulants. The first six cumulants of $\chi^2$ for a
($n \times 2$)-fold table when expectations are small, are deduced.
CAUSATION AND CORRELATION

(iv) The principles of the mathematical theory of correlation. By A. A.
TSCHUPROW. Translated by M. KANTOROWITSCH. Wm. Hodge & Co. Ltd. Price
12s. 6d.

This book is a translation of an enlarged reproduction of lectures delivered by Professor
Tschuprow at the University of Oslo, and originally printed in German about a decade
ago. It will be of interest to many people in that it is a complete survey of correlation
theory and its underlying principles. The theory is expounded along the now familiar
classical lines which had their origin in Karl Pearson's writings, and although the application
of this theory to practical problems is perhaps a little out of date, it still forms a
necessary background which the student must acquire, and of which a proper understanding
will always be essential.

The book as a whole shows an astonishing "patchiness" in writing. It may be the
fault of the translator, but the fact remains that the meaning of whole paragraphs is
sometimes very obscure, while at other times the ease and lucidity with which arguments
are presented are unrivalled by any comparable treatise. This unevenness is unfortunate,
since the book will be read more for the exposition of underlying principles than for its
algebraic development of the theory; indeed it may be questioned whether those who
have not previously mastered the elements of statistical theory and probability will obtain
any profit from its reading. There is no clear discussion of probability and what it means,
and the reader is left to find out from examples how a probability may be calculated.
This cannot be deemed a fault, but seems to point to the book being useful only if previous
knowledge of the subject is obtained elsewhere.

The chapter on stochastic connexion and functional relationship is good, and may be
read with advantage by any statistical worker and teacher. The method of approach is
the same as that of the present day, and will doubtless be followed for many years to
come. It is, however, a little astonishing to find that the translator makes no use of the
term "random variable", which has long passed into common use. The general discussion
throughout the book and in particular in the chapter entitled "Object and value of Correlation
Measurement" is stimulating, even if the reader is not always in agreement with the
author. The mathematical development of the theory is set out with clarity and freshness,
which make it enjoyable reading. The notation is a little cumbersome, but, once mastered,
it does not prove difficult to follow.

Taken as a whole, this book is a worthy contribution to correlation literature, and it is
surprising that no translation has been published until this date. It certainly should be
read by all who attempt to gain an understanding of statistical theory.


F. N. DAVID.




